Computing system with distributed compute-enabled storage group and method of operation thereof

ABSTRACT

A computing system includes: a storage device configured to perform in-storage processing with formatted data based on application data from an application; and return an in-storage processing output to the application for continued execution.

CROSS-REFERENCE TO RELATED APPLICATION(S)

This application claims the benefit of U.S. Provisional Patent Application Ser. No. 62/098,530 filed Dec. 31, 2014, and the subject matter thereof is incorporated herein by reference thereto.

TECHNICAL FIELD

Embodiments relate generally to a computing system, and more particularly to a system with distributed compute-enabled storage.

BACKGROUND

Modern consumer and industrial electronics, such as computing systems, servers, appliances, televisions, cellular phones, automobiles, satellites, and combination devices, are providing increasing levels of functionality to support modern life. These devices are more interconnected. Storage of information is becoming more of a necessity.

Research and development in the existing technologies can take a myriad of different directions. Storing information locally or over a distributed network is becoming more important. Processing efficiency and inputs/outputs between storage and computing resources are more problematic as the amount of data, computation, and storage increases.

Thus, a need still remains for a computing system with distributed compute-enabled storage group for ubiquity of storing and retrieving information regardless of the source of data or the request for the data, respectively. In view of the ever-increasing commercial competitive pressures, along with growing consumer expectations and the diminishing opportunities for meaningful product differentiation in the marketplace, it is increasingly critical that answers be found to these problems. Additionally, the need to reduce costs, improve efficiencies and performance, and meet competitive pressures adds an even greater urgency to the critical necessity for finding answers to these problems.

Solutions to these problems have been long sought but prior developments have not taught or suggested any solutions and, thus, solutions to these problems have long eluded those skilled in the art.

SUMMARY

An embodiment provides an apparatus, including: a storage device configured to perform in-storage processing with formatted data based on application data from an application; and return an in-storage processing output to the application for continued execution.

An embodiment provides a method including: performing in-storage processing with a storage device with formatted data based on application data from an application; and returning an in-storage processing output from the storage device to the application for continued execution.

Certain embodiments of the invention have other steps or elements in addition to or in place of those mentioned above. The steps or elements will become apparent to those skilled in the art from a reading of the following detailed description when taken with reference to the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a computing system with distributed compute-enabled storage group in an embodiment of the present invention.

FIG. 2 is an example of an architectural view of a computing system with a distributed compute-enabled storage device.

FIG. 3 is an example of an operational view for a split function of the data preprocessor.

FIG. 4 is an example of an operational view for a split+padding function of the data preprocessor.

FIG. 5 is an example of an operational view for a split+redundancy function of the data preprocessor.

FIG. 6 is an example of an operational view for a mirroring function of the data preprocessor.

FIG. 7 is an example of an architectural view of the output coordinator.

FIGS. 8A and 8B are detailed examples of an operational view of the split and split+padding functions.

FIG. 9 is an example of an architectural view of the computing system in an embodiment.

FIG. 10 is an example of an architectural view of the computing system in a further embodiment.

FIG. 11 is an example of an architectural view of the computing system in a yet further embodiment.

FIG. 12 is an example of an operational view of the computing system issuing device requests for in-storage processing in a centralized coordination model.

FIG. 13 is an example of an operational view of the computing system issuing device requests for in-storage processing in a decentralized coordination model.

FIG. 14 is an operational view for the computing system of a centralized coordination model.

FIG. 15 is an operational view for a computing system in a decentralized model in an embodiment with one output coordinator.

FIG. 16 is an operational view of a computing system in a decentralized model in an embodiment with multiple output coordinators.

FIG. 17 is an example of a flow chart for the request distributor and the data preprocessor.

FIG. 18 is an example of a flow chart for a mirroring function for centralized and decentralized embodiments.

FIG. 19 is a flow chart of a method of operation of a computing system in an embodiment of the present invention.

DETAILED DESCRIPTION

Various embodiments provide a computing system for efficient distributed processing by providing methods and apparatus for performing in-storage processing with multiple storage devices with capabilities for performing in-storage processing of the application data. An execution of an application can be shared by distributing the execution among various storage devices in a storage group. Each of the storage devices can perform in-storage processing with the application data as requested by an application request.

Various embodiments provide a computing system to reduce overall system power consumption by reducing the number of inputs/outputs between the application execution and the storage device. This reduction is achieved by having the storage devices perform in-storage processing instead of mere storage, read, and re-store by the application. The in-storage processing outputs can be returned as an aggregated output from the various storage devices that performed the in-storage processing, back to the application. The application can continue to execute and utilize the in-storage outputs, the aggregated output, or a combination thereof.

Various embodiments provide a computing system that reduces total cost of ownership by providing formatting and translation functions for the application data for different configurations or organizations of the storage device. Further, the computing system also provides translation for the in-storage processing to be carried out by the various storage devices as part of the storage group. Examples of types of translation or formatting include split, split+padding, split+redundancy, and mirroring.

Various embodiments provide a computing system that also minimizes integration obstacles by allowing the storage devices to handle more of the in-storage processing coordination functions, with less being done by the host executing the application. Another embodiment allows for the in-storage processing coordination to increasingly be located and operate outside of both the host and the storage devices.

Various embodiments provide a computing system with more efficient execution of the application with fewer interrupts to the application by coordinating the outputs of the in-storage processing from the storage devices. The output coordination can buffer the in-storage processing outputs and can also sort the order of each of the in-storage processing outputs before returning an aggregated output to the application. The application can continue to execute and utilize the in-storage outputs, the aggregated output, or a combination thereof.

Various embodiments provide a computing system further minimizing integration obstacles by allowing the storage devices in the storage group to have different or the same functionalities. As an example, one of the storage devices can function as the only output coordinator for all the in-storage processing outputs from the other storage devices. As a further example, the aggregation function can be distributed amongst the storage devices, passing along from storage device to storage device and performing partial aggregation at each storage device, until a final one of the storage devices returns the full aggregated output back to the application. The application can continue to execute and utilize the in-storage outputs, the aggregated output, or a combination thereof.

The following embodiments are described in sufficient detail to enable those skilled in the art to make and use the invention. It is to be understood that other embodiments may be evident based on the present disclosure, and that system, process, architectural, or mechanical changes can be made to the embodiments as examples without departing from the scope of the present invention.

In the following description, numerous specific details are given to provide a thorough understanding of the invention. However, it will be apparent that the invention and various embodiments may be practiced without these specific details. In order to avoid obscuring an embodiment of the present invention, some well-known circuits, system configurations, and process steps are not disclosed in detail.

The drawings showing embodiments of the system are semi-diagrammatic, and not to scale and, particularly, some of the dimensions are for the clarity of presentation and are shown exaggerated in the drawing figures. Similarly, although the views in the drawings for ease of description generally show similar orientations, this depiction in the figures is arbitrary for the most part. Generally, an embodiment can be operated in any orientation.

The term “module” referred to herein can include software, hardware, or a combination thereof in an embodiment of the present invention in accordance with the context in which the term is used. For example, the software can be machine code, firmware, embedded code, application software, or a combination thereof. Also for example, the hardware can be circuitry, processor, computer, integrated circuit, integrated circuit cores, a pressure sensor, an inertial sensor, a microelectromechanical system (MEMS), passive devices, or a combination thereof. Additional examples of hardware circuitry can be digital circuits or logic, analog circuits, mixed-mode circuits, optical circuits, or a combination thereof. Further, if a module is written in the apparatus claims section below, the modules are deemed to include hardware circuitry for the purposes and the scope of apparatus claims.

The modules in the following description of the embodiments can be coupled to one another as described or as shown. The coupling can be direct or indirect, that is, without or with, respectively, intervening items between the coupled items. The coupling can be by physical contact or by communication between items.

Referring now to FIG. 1, therein is shown a computing system 100 with a data protection mechanism in an embodiment of the present invention. The computing system 100 is depicted in FIG. 1 as a functional block diagram of the computing system 100 with a data storage system 101. The functional block diagram depicts the data storage system 101 installed in a host computer 102.

Various embodiments can include the computing system 100 with devices for storage, such as a solid state disk 110, a non-volatile memory 112, hard disk drives 116, memory devices 117, and network attached storage 122. These devices for storage can include capabilities to perform in-storage processing, that is, to independently perform relatively complex computations at a location outside of a traditional system CPU. As part of the in-storage processing paradigm, various embodiments of the present inventive concept manage the distribution of data, the location of data, and the location of processing tasks for in-storage processing. Further, these in-storage computing enabled storage devices can be grouped or clustered into arrays. Various embodiments manage the allocation of data and/or processing based on the architecture and capabilities of these devices or arrays. In-storage processing is further explained later.

As an example, the host computer 102 can be a server or workstation. The host computer 102 can include at least a central processing unit 104, host memory 106 coupled to the central processing unit 104, and a host bus controller 108. The host bus controller 108 provides a host interface bus 114, which allows the host computer 102 to utilize the data storage system 101.

It is understood that the function of the host bus controller 108 can be provided by the central processing unit 104 in some implementations. The central processing unit 104 can be implemented with hardware circuitry in a number of different manners. For example, the central processing unit 104 can be a processor, an application specific integrated circuit (ASIC), an embedded processor, a microprocessor, a hardware control logic, a hardware finite state machine (FSM), a digital signal processor (DSP), a field programmable gate array (FPGA), or a combination thereof.

The data storage system 101 can be coupled to a solid state disk 110, such as a non-volatile memory based storage group having a peripheral interface system, or a non-volatile memory 112, such as an internal memory card for expanded or extended non-volatile system memory.

The data storage system 101 can also be coupled to hard disk drives (HDD) 116 that can be mounted in the host computer 102, external to the host computer 102, or a combination thereof. The solid state disk 110, the non-volatile memory 112, and the hard disk drives 116 can be considered as direct attached storage (DAS) devices, as an example.

The data storage system 101 can also support a network attach port 118 for coupling to a network 120. Examples of the network 120 can include a personal area network (PAN), a local area network (LAN), a storage area network (SAN), a wide area network (WAN), or a combination thereof. The network attach port 118 can provide access to network attached storage (NAS) 122. The network attach port 118 can also provide connection to and from the host bus controller 108.

While the network attached storage 122 is shown as hard disk drives, this is an example only. It is understood that the network attached storage 122 could include any non-volatile storage technology, such as magnetic tape storage (not shown), storage devices similar to the solid state disk 110, non-volatile memory 112, or hard disk drives 116 that are accessed through the network attach port 118. Also, the network attached storage 122 can include aggregated resources, such as just a bunch of disks (JBOD) systems or redundant array of inexpensive disks (RAID) systems as well as other network attached storage 122.

The data storage system 101 can be attached to the host interface bus 114 for providing access to and interfacing to multiple of the direct attached storage (DAS) devices via a cable 124 for storage interface, such as Serial Advanced Technology Attachment (SATA), the Serial Attached SCSI (SAS), or the Peripheral Component Interconnect Express (PCI-e) attached storage devices.

The data storage system 101 can include a storage engine 115 and memory devices 117. The storage engine 115 can be implemented with hardware circuitry, software, or a combination thereof in a number of ways. For example, the storage engine 115 can be implemented as a processor, an application specific integrated circuit (ASIC), an embedded processor, a microprocessor, a hardware control logic, a hardware finite state machine (FSM), a digital signal processor (DSP), FPGA, or a combination thereof.

The central processing unit 104 or the storage engine 115 can control the flow and management of data to and from the host computer 102, and from and to the direct attached storage (DAS) devices, the network attached storage 122, or a combination thereof. The storage engine 115 can also perform data reliability check and correction, which will be further discussed later. The storage engine 115 can also control and manage the flow of data between the direct attached storage (DAS) devices and the network attached storage 122 and amongst themselves. The storage engine 115 can be implemented in hardware circuitry, a processor running software, or a combination thereof.

For illustrative purposes, the storage engine 115 is shown as part of the data storage system 101, although the storage engine 115 can be implemented and partitioned differently. For example, the storage engine 115 can be implemented as part of the host computer 102, implemented in software, implemented in hardware, or a combination thereof. The storage engine 115 can be external to the data storage system 101. As examples, the storage engine 115 can be part of the direct attached storage (DAS) devices described above, the network attached storage 122, or a combination thereof. The functionalities of the storage engine 115 can be distributed as part of the host computer 102, the direct attached storage (DAS) devices, the network attached storage 122, or a combination thereof. The central processing unit 104 or some portion of it can also be in the data storage system 101, the direct attached storage (DAS) devices, the network attached storage 122, or a combination thereof.

The memory devices 117 can function as a local cache to the data storage system 101, the computing system 100, or a combination thereof. The memory devices 117 can be a volatile memory or a nonvolatile memory. Examples of the volatile memory can include static random access memory (SRAM) or dynamic random access memory (DRAM).

The storage engine 115 and the memory devices 117 enable the data storage system 101 to meet the performance requirements of data provided by the host computer 102 and store that data in the solid state disk 110, the non-volatile memory 112, the hard disk drives 116, or the network attached storage 122.

For illustrative purposes, the data storage system 101 is shown as part of the host computer 102, although the data storage system 101 can be implemented and partitioned differently. For example, the data storage system 101 can be implemented as a plug-in card in the host computer 102, as part of a chip or chipset in the host computer 102, as partially implemented in software and partially implemented in hardware in the host computer 102, or a combination thereof. The data storage system 101 can be external to the host computer 102. As examples, the data storage system 101 can be part of the direct attached storage (DAS) devices described above, the network attached storage 122, or a combination thereof. The data storage system 101 can be distributed as part of the host computer 102, the direct attached storage (DAS) devices, the network attached storage 122, or a combination thereof.

Referring now to FIG. 2, therein is shown an architectural view of a computing system 100 with a distributed compute-enabled storage device. The architectural view can depict an example of relationships between some parts in the computing system 100. As an example, the architectural view can depict the computing system 100 to include an application 202, an in-storage processing coordinator 204, and a storage group 206.

As an example, the storage group 206 can be partitioned in the computing system 100 of FIG. 1 in a number of ways. For example, the storage group 206 can be part of or distributed among the data storage system 101 of FIG. 1, the hard disk drives 116 of FIG. 1, the network attached storage 122 of FIG. 1, the solid state disk 110 of FIG. 1, the non-volatile memory 112 of FIG. 1, or a combination thereof.

The application 202 is a process executing a function. The application 202 can provide an end-user (not shown) function or other functions related to the operation, control, usage, or communication of the computing system 100. As an example, the application 202 can be a software application executed by a processor, a central processing unit (CPU), a programmable hardware state machine, or other hardware circuitry that can execute software code from the software application. As a further example, the application 202 can be a function executed purely in hardware circuitry, such as logic gates, finite state machine (FSM), transistors, or a combination thereof. The application 202 can execute on the central processing unit 104 of FIG. 1.

The in-storage processing coordinator 204 manages the communication and activities between the application 202 and the storage group 206. The in-storage processing coordinator 204 can manage the operations between the application 202 and the storage group 206. As an example, the in-storage processing coordinator 204 can translate information between the application 202 and the storage group 206. Also for example, the in-storage processing coordinator 204 can direct information flow and assignments between the application 202 and the storage group 206. As an example, the in-storage processing coordinator 204 can include a data preprocessor 208, a request distributor 210, and an output coordinator 212.

As an example, the in-storage processing coordinator 204 or portions of it can be executed by the central processing unit 104 or other parts of the host computer 102. The in-storage processing coordinator 204 or portions of it can also be executed by the data storage system 101. As a specific example, the storage engine 115 of FIG. 1 can execute the in-storage processing coordinator 204 or portions of it. The hard disk drives 116 of FIG. 1, the network attached storage 122 of FIG. 1, the solid state disk 110 of FIG. 1, the non-volatile memory 112 of FIG. 1, or a combination thereof can execute the in-storage processing coordinator 204 or portions of it.

The data preprocessor 208 performs data formatting of application data 214 and placement of formatted data 216. The application data 214 is the information or data generated by the application 202. The formatting can enable the application data 214 to be stored as the formatted data 216 across multiple storage devices 218 in the storage group 206 for in-storage processing (ISP).

In-storage processing refers to the processing or manipulation of the formatted data 216 to be sent back to the application 202 or the system executing the application 202. The in-storage processing is more than mere storing and retrieval of the formatted data 216. Examples of the manipulation or processing as part of the in-storage processing can include integer or floating point math operations, Boolean operations, reorganization of data bits or symbols, or a combination thereof. Other examples of manipulating or processing as part of the in-storage processing can include search, sort, compares, filtering, combining the formatted data 216, the application data 214, or a combination thereof.
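
As an illustrative sketch only, the difference between mere retrieval and in-storage processing can be modeled as follows. The class `StorageDevice` and its methods `read_all` and `in_storage_filter` are hypothetical names introduced for this illustration and are not part of the described embodiments; filtering is used as one of the example operations named above.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class StorageDevice:
    """Hypothetical stand-in for one compute-enabled storage device."""
    formatted_units: List[int] = field(default_factory=list)

    def read_all(self) -> List[int]:
        # Mere retrieval: every stored unit is sent back to the host.
        return list(self.formatted_units)

    def in_storage_filter(self, predicate: Callable[[int], bool]) -> List[int]:
        # In-storage processing: the device applies the filter itself
        # and returns only the matching units as its output.
        return [unit for unit in self.formatted_units if predicate(unit)]


device = StorageDevice([3, 18, 7, 42, 11])
isp_output = device.in_storage_filter(lambda unit: unit > 10)
```

In this sketch the in-storage processing output is smaller than the full read-back, which is the kind of reduction in inputs/outputs the embodiments aim for.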

As a further example, the data preprocessor 208 can format the application data 214 from the application 202 and generate the formatted data 216 to be processed outside or independent from execution of the application 202. This independent processing can be performed with the in-storage processing. The application data 214 can be independent of and not necessarily the same format as those stored in the storage group 206. The format of the application data 214 can be different than the formatted data 216, which will be described later.

Depending on the type of the application data 214, array configurations of the storage group 206, or other user-defined policies, the application data 214 can be processed in various ways. As an example, the policies can refer to availability requirements so as to affect the array configuration, such as mirroring, of the storage group 206. As a further example, the policies can refer to performance requirements as to further affect the array configuration, such as striping, of the storage group 206.

As examples of translation, the application data 214 can be translated to the formatted data 216 using various methods, such as split, split+padding, split+redundancy, and mirroring. These methods can create independent data sets of the formatted data 216 that can be distributed to multiple storage devices 218, allowing for concurrent in-storage processing. The concurrent in-storage processing refers to each of the storage devices 218 in the storage group 206 being able to independently process or operate on the formatted data 216, the application data 214, or a combination thereof. This independent processing or operation can be independent of the execution of the application 202, the other storage devices 218 of the storage group 206 that received some of the formatted data 216 from the application data 214, or a combination thereof.
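
As a minimal sketch under stated assumptions, two of the translation methods named above, split and mirroring, can be modeled as functions that produce one independent data set per storage device, after which each data set is processed concurrently. The function names `split`, `mirror`, and `process_concurrently` are hypothetical, the round-robin distribution in `split` is an assumption made only for illustration, and a thread pool stands in for the independent storage devices. The split+padding and split+redundancy methods are not sketched here.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence


def split(units: Sequence[int], num_devices: int) -> List[List[int]]:
    # Assumed round-robin split: each unit is placed on exactly one device.
    return [list(units[i::num_devices]) for i in range(num_devices)]


def mirror(units: Sequence[int], num_devices: int) -> List[List[int]]:
    # Mirroring: every device receives a full copy of the data.
    return [list(units) for _ in range(num_devices)]


def process_concurrently(data_sets: List[List[int]]) -> List[int]:
    # Each worker models one storage device summing its own data set
    # independently of the others; results keep the device order.
    with ThreadPoolExecutor(max_workers=len(data_sets)) as pool:
        return list(pool.map(sum, data_sets))


units = [1, 2, 3, 4, 5, 6]
split_sets = split(units, 3)
partial_sums = process_concurrently(split_sets)
mirrored_sets = mirror(units, 2)
```

Because the data sets produced by `split` do not overlap, the partial sums can be combined into the same total the application would have computed on its own.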

The request distributor 210 manages application requests 220 between the application 202 and the storage group 206. As a specific example, the request distributor 210 accepts the application requests 220 from the application 202 and distributes them. The application requests 220 are actions between the application 202 and the storage group 206 based on the in-storage processing. For example, the application requests 220 can provide information from the application 202 to be off-loaded to the storage group 206 for in-storage processing. Furthering the example, the results of the in-storage processing can be returned to the application 202 based on the application requests 220.

As an example, the request distributor 210 manages the application requests 220 from the application 202 for in-storage processing, for write or storage, or for output. The request distributor 210 also distributes the application requests 220 from the application 202 across the multiple storage devices 218 in the storage group 206.

As another example, incoming application requests 220 for in-storage processing can be split into multiple sub-application requests 222 to perform in-storage processing according to a distribution of the formatted data 216, organization of the storage group 206, or other policies. The request distributor 210 can perform this split of the application request 220 for in-storage processing based on the placement scheme for the application data 214, the formatted data 216, or a combination thereof.
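
The split of one application request into sub-application requests according to data placement can be sketched as follows. This is an illustration under assumptions: the `SubRequest` record, the `split_request` function, and the `placement` mapping from a formatted unit identifier to a device identifier are all hypothetical and are not defined by the embodiments.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class SubRequest:
    """Hypothetical sub-application request addressed to one device."""
    device_id: int
    operation: str
    unit_ids: Tuple[int, ...]


def split_request(operation: str,
                  unit_ids: Sequence[int],
                  placement: Dict[int, int]) -> List[SubRequest]:
    # Group the requested units by the device that holds them, then
    # emit one sub-request per device that holds at least one unit.
    per_device: Dict[int, List[int]] = {}
    for unit_id in unit_ids:
        per_device.setdefault(placement[unit_id], []).append(unit_id)
    return [SubRequest(device, operation, tuple(ids))
            for device, ids in sorted(per_device.items())]


# Assumed placement: formatted unit id -> storage device id.
placement = {0: 0, 1: 1, 2: 0, 3: 1, 4: 2}
sub_requests = split_request("filter", [0, 1, 2, 4], placement)
```

Here a single request covering four units becomes three sub-requests, one for each device that holds part of the requested data.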

Example types of data placement schemes include a centralized scheme and a decentralized scheme, discussed with FIGS. 9 to 11. In various embodiments in a centralized scheme, the data preprocessor 208 is placed inside the in-storage processing coordinator 204, while a decentralized model places the data preprocessor 208 inside the storage group 206.

For the embodiments with a centralized scheme, once the in-storage processing coordinator 204 receives an application request 220 such as a data write request with required information, for example an address, data, data length, and a logical boundary, from the application 202, the request distributor 210 provides the data preprocessor 208 with the required information such as data, data length, and logical boundary. Then, the data preprocessor 208 partitions the data into multiple data chunks of an appropriate size based on the store unit information. Then, the request distributor 210 distributes the corresponding data chunks to each of the storage devices 218 with multiple sub-application requests 222. The storage group 206, the storage devices 218, or a combination thereof can receive the application requests 220, the sub-application requests 222, or a combination thereof. On the other hand, the request distributor 210 in a decentralized model divides the data into chunks of a predefined size, for instance, data size/N, where N is the number of storage devices, and then distributes the chunks of data to each of the storage devices 218 with sub-application requests 222 combined with the required information such as address, data length, and a logical boundary. Then, the data preprocessor 208 inside the storage devices 218 partitions the assigned data into smaller chunks based on the store unit information.
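
The two placement flows described above can be contrasted with a short sketch. The function names are hypothetical. The round-robin assignment of chunks to devices in the centralized case and the rounding up of data size/N in the decentralized case are assumptions made for illustration, since the text does not fix either detail.

```python
from typing import List


def chunk(data: bytes, size: int) -> List[bytes]:
    # Cut data into consecutive pieces of at most `size` bytes.
    return [data[i:i + size] for i in range(0, len(data), size)]


def centralized_placement(data: bytes, store_unit: int,
                          num_devices: int) -> List[List[bytes]]:
    # The coordinator-side preprocessor partitions by store unit first,
    # then the distributor hands out chunks (round-robin is assumed).
    per_device: List[List[bytes]] = [[] for _ in range(num_devices)]
    for index, piece in enumerate(chunk(data, store_unit)):
        per_device[index % num_devices].append(piece)
    return per_device


def decentralized_placement(data: bytes, store_unit: int,
                            num_devices: int) -> List[List[bytes]]:
    # The distributor sends about data size/N to each device (rounded
    # up here); each device-side preprocessor then partitions its own
    # share into store-unit-sized chunks.
    share = max(1, -(-len(data) // num_devices))
    shares = chunk(data, share)
    shares += [b""] * (num_devices - len(shares))
    return [chunk(part, store_unit) for part in shares]


data = b"ABCDEFGHIJKL"
central = centralized_placement(data, store_unit=2, num_devices=3)
decentral = decentralized_placement(data, store_unit=2, num_devices=3)
```

With these assumptions, the centralized flow interleaves store units across the devices, while the decentralized flow gives each device one contiguous share that it partitions locally.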

As a further specific example, for a write request for the application data 214, given the application data 214 to be written, its length, and an optional logical boundary of the application data 214, the request distributor 210 can send the write request to the data preprocessor 208 so that it can determine how to distribute the application data 214. Once data distribution is determined, the request distributor 210 can issue the write request to the storage devices 218 in the storage group 206. The host bus controller 108 of FIG. 1 or the network attach port 118 of FIG. 1 can be used to execute the request distributor 210 and issue the application requests 220.

Continuing with the example, the storage devices 218 can perform the in-storage processing on the formatted data 216. The request distributor 210 can process the request for output by forwarding the output request to the in-storage processing coordinator 204, or as a specific example to the output coordinator 212, to send in-storage processing outputs 224 back to the application 202 or the system executing the application 202. The application 202 can continue to execute with the in-storage processing outputs 224. The in-storage processing outputs 224 can be the results of the in-storage processing by the storage group 206 of the formatted data 216. The in-storage processing outputs 224 are not a mere read-back or read of the formatted data 216 stored in the storage group 206.

The output coordinator 212 can manage processed data generated from each of the multiple storage devices 218 of the storage group 206 and can send it back to the application 202. As an example, the output coordinator 212 collects the results or the in-storage processing outputs 224 and provides them to the application 202 or various applications 202 or the system executing the application 202. The output coordinator 212 will be described later.
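
As a sketch only, the collection of in-storage processing outputs can be modeled as a buffer that accepts outputs in any arrival order and releases one ordered, aggregated output once all expected outputs are present. The class `OutputCollector`, its methods, and the use of a sequence number per sub-request are hypothetical names and assumptions for this illustration, not a description of the output coordinator 212 itself.

```python
from typing import Dict, List


class OutputCollector:
    """Hypothetical buffer for outputs arriving from several devices."""

    def __init__(self, expected_outputs: int) -> None:
        self.expected_outputs = expected_outputs
        self._buffer: Dict[int, List[int]] = {}

    def submit(self, sequence: int, output: List[int]) -> None:
        # Buffer one device's output under its sub-request sequence number.
        self._buffer[sequence] = output

    def is_complete(self) -> bool:
        return len(self._buffer) == self.expected_outputs

    def aggregate(self) -> List[int]:
        # Sort by sequence number and concatenate into a single output,
        # so the application is handed one result rather than several.
        if not self.is_complete():
            raise RuntimeError("not all in-storage outputs have arrived")
        merged: List[int] = []
        for sequence in sorted(self._buffer):
            merged.extend(self._buffer[sequence])
        return merged


collector = OutputCollector(expected_outputs=3)
collector.submit(2, [50, 60])   # outputs may arrive out of order
collector.submit(0, [10, 20])
collector.submit(1, [30, 40])
aggregated = collector.aggregate()
```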

The computing system 100 also can provide error handling capabilities. For example, when one or more of the storage devices 218 in the storage group 206 become inaccessible or have slower performance, the application requests 220 can fail, such as through time-outs or non-completions. For better availability, the computing system 100 can perform a number of actions.

The following are examples for the application requests 220 for writes to the storage group 206. The in-storage processing coordinator 204, or as a more specific example the request distributor 210, can maintain a request log that can be used to issue retries for the application requests 220 that failed or were not completed. Also as an example, the in-storage processing coordinator 204 can keep retrying the application requests 220 to write the application data 214. As a further example, the in-storage processing coordinator 204 can report the status of the application requests 220 to the application 202.
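
A request log with retries can be sketched as below. The function `write_with_retries`, the `send` callback, the status strings, and the bound of three attempts are assumptions for illustration; the text does not specify a retry limit or a log format.

```python
from typing import Callable, Dict, Iterable


def write_with_retries(send: Callable[[str], bool],
                       requests: Iterable[str],
                       max_attempts: int = 3) -> Dict[str, str]:
    # The request log maps each write request to its latest status.
    request_log = {request: "pending" for request in requests}
    for _ in range(max_attempts):
        pending = [r for r, status in request_log.items()
                   if status != "completed"]
        if not pending:
            break
        for request in pending:
            # Reissue every request that has not yet completed.
            request_log[request] = "completed" if send(request) else "failed"
    # The final log can be reported back to the application as status.
    return request_log


attempts: Dict[str, int] = {}


def flaky_send(request: str) -> bool:
    # Simulated device: request "w2" times out on its first attempt only.
    attempts[request] = attempts.get(request, 0) + 1
    return not (request == "w2" and attempts[request] == 1)


status = write_with_retries(flaky_send, ["w1", "w2"])
```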

The following are examples for the application requests 220 for in-storage processing at the storage group 206. If one of the storage devices 218 in the storage group 206 includes a replica of the application data 214, the formatted data 216, or a combination thereof held by the storage device 218 that was inaccessible, these application requests 220 can be redirected to the storage device 218 with the replica. If error recovery is possible, the error recovery process can be executed prior to the previous failed application requests 220 being reissued to the recovered storage device 218. An example of the error recovery technique can be a redundant array of inexpensive disks (RAID) recovery with rebuilding a storage device 218 that has been striped. As other examples, the in-storage processing coordinator 204 can retry the application requests 220 that previously failed. The in-storage processing coordinator 204 can also generate reports of failures even if the application requests 220 are redirected, retried, and even eventually successful.
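
Redirecting an in-storage processing request to a replica, while still recording the failure, can be sketched as follows. The function `route_request`, the `replicas` mapping, and the device names are hypothetical and introduced only for this illustration.

```python
from typing import Dict, List, Optional, Set, Tuple


def route_request(request_id: str,
                  target: str,
                  replicas: Dict[str, List[str]],
                  accessible: Set[str],
                  failure_report: List[Tuple[str, str]]) -> Optional[str]:
    # Use the intended device when it is reachable.
    if target in accessible:
        return target
    # Record the failure even if the request is redirected successfully.
    failure_report.append((request_id, target))
    # Redirect to the first reachable device holding a replica.
    for candidate in replicas.get(target, []):
        if candidate in accessible:
            return candidate
    # No replica is reachable; recovery or a later retry would be needed.
    return None


replicas = {"dev0": ["dev2"]}
accessible = {"dev1", "dev2"}
report: List[Tuple[str, str]] = []
chosen = route_request("isp-7", "dev0", replicas, accessible, report)
```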

The in-storage processing coordinator 204 or at least a portion of it can be implemented in a number of ways. As an example, the in-storage processing coordinator 204 can be implemented with software, hardware circuitry, or a combination thereof. Examples of hardware circuitry can include a processor, an application specific integrated circuit (ASIC), an embedded processor, a microprocessor, a hardware control logic, a hardware finite state machine (FSM), a digital signal processor (DSP), FPGA, or a combination thereof.

Referring now to FIG. 3, therein is shown an example of an operational view for a split function of the data preprocessor 208 of FIG. 2. FIG. 3 depicts the application data 214 as input to the data preprocessor 208 or more generally the in-storage processing coordinator 204 of FIG. 2. FIG. 3 depicts one example method of the data formatting performed by the data preprocessor 208 as mentioned in FIG. 2. In this example, the data formatting is a split function or a split scheme. FIG. 3 also depicts the formatted data 216 as the output of the data preprocessor 208.

In this example, the amount of the application data 214 is shown to span a transfer length 302. The transfer length 302 refers to the amount of data or information sent by the application 202 to the data preprocessor 208 or vice versa. The transfer length 302 can be a fixed size or variable depending on what the application 202 transfers for in-storage processing.

Also in this example, the application data 214 can include application units 304. The application units 304 are fields within or portions of the application data 214. Each of the application units 304 can be fixed in size or can be variable. As an example, the application units 304 can represent partitioned portions or chunks of the application data 214.

As an example, the size of each of the application units 304 can be the same across the application data 214. Also as an example, the size of each of the application units 304 across the application data 214 can differ for different transfers of the application data 214. Further for example, the size of each of the application units 304 can vary within the same transfer or across transfers. The application units 304 can also vary in size depending on the different applications 202 sending the application data 214. The number of application units 304 can vary or can be fixed. The number of application units 304 can vary for the same application 202 sending the application data 214 or between different applications 202.

FIG. 3 also depicts the formatted data 216 as the output of the data preprocessor 208. The formatted data 216 can include formatted units 306 (FUs). The formatted units 306 are fields within the formatted data 216. In this example, each of the formatted units 306 can be fixed in size or can be variable. The size of the formatted units 306 can be the same for the formatted data 216 or different for transfers of the formatted data 216, or can vary within the same transfer or across transfers. The formatted units 306 can also vary in size depending on the different applications 202 sending the formatted data 216. The number of the formatted units 306 can vary or can be fixed. The number of the formatted units 306 can vary for the same application 202 sending the formatted data 216 or between different applications 202.

FIG. 3 depicts the formatted data 216 after a split formatting with the formatted units 306 overlaid visually with the application units 304. An example storage application for this split formatting or split scheme can be with redundant array of inexpensive disks (RAID) systems as the storage group 206, or with at least some of the multiple storage devices 218 in the storage group 206. The in-storage processing, or even the mere storage of the application data 214, can at least involve splitting the application units 304 to different destination devices in the storage group 206.

Continuing with this example, the data preprocessor 208 can split the application data 214 into predefined fixed-length blocks referred to as the formatted units 306 and can give each block to one or more of the multiple storage devices 218 for in-storage processing in a round robin fashion, as an example. The split scheme can generate non-aligned data sets between the application data 214 and the formatted data 216. As a specific example, the data preprocessor 208 can generate the non-alignment between the application units 304 relative to the boundaries for the formatted units 306.

Further with this example, FIG. 3 depicts an alternating formatting or allocation of the application units 304 to the different devices in the storage group 206. In this example, the application units 304 are depicted as “Data 1”, “Data 2”, “Data 3”, and through “Data K”. The formatted units 306 are depicted as “FU 1”, “FU 2”, and through “FU N”.

As a specific example, the formatted data 216 can have alternating instances targeted for one device or another device in the storage group 206. In other words, for example, odd numbered “FUs” can be for drive 1 and even numbered “FUs” can be for drive 0. The overlay of the application units 304 as “Data” is shown as not aligned with the boundaries of the “FU” and FIG. 3 depicts “Data 2” and “Data K” being split between FU 1 (drive 1) and FU 2 (drive 0), again for this example.
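As an illustrative sketch only, the split scheme with round robin placement can be modeled by computing which formatted units each application unit touches. The function name split_layout and its parameters are hypothetical. The sketch numbers formatted units and devices from zero, whereas FIG. 3 numbers them from one, and it assumes every application unit has a nonzero size.

```python
from typing import Dict, List


def split_layout(unit_sizes: List[int], fu_size: int,
                 num_devices: int) -> List[Dict[str, object]]:
    """Map each application unit onto fixed-length formatted units.

    A unit is treated as aligned when it lies entirely within one
    formatted unit, and as non-aligned when it crosses a boundary.
    """
    layout = []
    offset = 0
    for index, size in enumerate(unit_sizes):
        first_fu = offset // fu_size
        last_fu = (offset + size - 1) // fu_size
        fus = list(range(first_fu, last_fu + 1))
        layout.append({
            "unit": index,
            "fus": fus,
            # Round robin: formatted unit k goes to device k mod num_devices.
            "devices": [fu % num_devices for fu in fus],
            "aligned": first_fu == last_fu,
        })
        offset += size
    return layout


layout = split_layout(unit_sizes=[3, 3, 2], fu_size=4, num_devices=2)
non_aligned = [entry["unit"] for entry in layout if not entry["aligned"]]
```

With these sizes, the second application unit straddles the boundary between the first two formatted units and therefore lands on two devices, which is the situation that calls for additional processing.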

As a further example, the formatted data 216 can also be stored on one of the storage devices 218 as opposed to being partitioned or allocated to different instances of the storage devices 218 in the storage group 206. In this example, the formatted units 306 can be sized for a sector based on a physical block address or a logical block address on one of the storage devices 218 as a hard disk drive or a solid state disk drive.

As a specific example for the split function, the request distributor 210 can initially send up to N in-storage processing application requests 220 to the storage devices 218 in the storage group 206. The term N is an integer number. The application units 304 that are not aligned with the formatted units 306, such as “Data 2” and “Data K” in this figure and example, can undergo additional processing at the storage devices 218 with in-storage processing.

For example, the non-aligned application units 304 can be determined after initial processing of the application data 214 by the host computer 102 of FIG. 1, the request distributor 210, or other storage devices 218. The non-aligned application units 304 can be fetched by the host computer 102 or the request distributor 210, allowing the non-aligned application units 304 to be concurrently processed by the host computer 102, the request distributor 210, the storage devices 218 in the storage group 206, or a combination thereof. The non-aligned application units 304 can also be fetched by the host computer 102 or the request distributor 210 such that these non-aligned application units 304 can be written back to the devices for in-storage processing. Each of the storage devices 218 can send the results of the processed non-aligned application units 304 to the host computer 102, the request distributor 210, or the other storage devices 218 so the host computer 102 or the other storage devices 218 can continue to process the application data 214.

Referring now to FIG. 4, therein is shown an example of an operational view for a split+padding function of the data preprocessor 208 of FIG. 2. FIG. 4 depicts the application data 214 as input to the data preprocessor 208 as similarly described in FIG. 3. FIG. 4 also depicts the formatted data 216 as the output of the data preprocessor 208. FIG. 4 depicts one example method of the data formatting performed by the data preprocessor 208 as mentioned in FIG. 2. In this example, the data formatting is a split+padding function or split+padding scheme.

In this example, the split+padding function by the data preprocessor 208 adds data pads 402 to align the application units 304 to the formatted units 306. The alignment of the application units 304 and the formatted units 306 can allow the request distributor 210 of FIG. 2 to send up to K independent in-storage processing application requests 220 to multiple storage devices 218 of FIG. 2 in the storage group 206 of FIG. 2. The term K is an integer. In other words, the alignment allows for each of the multiple storage devices 218 to perform in-storage processing of the application units 304, the formatted units 306, or a combination thereof independently without requiring further formatting or processing of the formatted data 216.

As a specific example, each of the formatted units 306 includes one of the application units 304 plus one of the data pads 402. Each of the data pads 402 aligns each of the application units 304 to the boundaries of each of the formatted units 306. The data pads 402 can also provide other functions or include other information. For example, the data pads 402 can include error detection or error correction information, such as parity, ECC protection, meta-data, etc.

The data pads 402 can be placed or located in a number of different locations within the formatted units 306. For example, one of the data pads 402 can be located at the end of one of the application units 304 as shown in FIG. 4. Also for example, each of the data pads 402 can also be located at the beginning of each of the formatted units 306 and before each of the application units 304. Further for example, each of the data pads 402 can be distributed, uniformly or non-uniformly, across each of the formatted units 306 and within each of the application units 304.

As an example, the size of each of the data pads 402 can depend on the difference in size between each of the application units 304 and each of the formatted units 306. The data pads 402 can be the same for each of the formatted units 306 or can vary. Further for example, the term size can refer to the number of bits or symbols for the formatted units 306, the application units 304, or a combination thereof. The term size can also refer to the transmission time, recording time, or a combination thereof for the formatted units 306, the application units 304, or a combination thereof.
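As an illustrative sketch only, the split+padding scheme with the pad placed at the end of each application unit can be modeled as follows. The function name split_with_padding and the zero-byte pad value are hypothetical assumptions; the sketch measures size in bytes and carries no parity or ECC information in the pad.

```python
from typing import List


def split_with_padding(units: List[bytes], fu_size: int,
                       pad_byte: bytes = b"\x00") -> List[bytes]:
    """Place each application unit in its own formatted unit, padded at the end.

    The pad length is the difference between the formatted unit size and
    the application unit size.
    """
    formatted = []
    for unit in units:
        if len(unit) > fu_size:
            raise ValueError("application unit is larger than the formatted unit")
        pad = pad_byte * (fu_size - len(unit))
        formatted.append(unit + pad)
    return formatted


formatted_units = split_with_padding([b"abc", b"de", b"fghi"], fu_size=4)
pad_lengths = [4 - len(unit) for unit in (b"abc", b"de", b"fghi")]
```

Each resulting formatted unit holds exactly one application unit, so K application units yield K formatted units that can be handled independently.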

In this example, the size of the application data 214 is shown to span the transfer length 302 as similarly described in FIG. 3. In this example, the application units 304 are depicted as “Data 1”, “Data 2”, “Data 3”, and through “Data K”. In this example, the formatted units 306 are depicted as “FU 1”, “FU 2”, and through “FU N”.

FIG. 4 depicts the formatted data 216 after a split+padding formatting with the formatted units 306 overlaid visually with the application units 304 and with the data pads 402. An example storage application for this split+padding formatting or scheme can be with use of redundant array of inexpensive disks (RAID) systems as the storage group 206 or with at least some of the multiple storage devices 218 in the storage group 206. The in-storage processing or even the mere storage of the application data 214 can at least involve splitting+padding the application units 304 to different destination devices in the storage group 206.

Continuing with this example, the data preprocessor 208 can split the application data 214 with the data pads 402 into a predefined length, and give each length to one or more of the storage devices 218 for in-storage processing. A length can include any number of the application units 304. As a specific example, the formatted data 216 of any length can be targeted for one or more of the multiple storage devices 218 in the storage group 206.

Referring now to FIG. 5, therein is shown an example of an operational view for a split+redundancy function of the data preprocessor 208 of FIG. 2. FIG. 5 depicts the application data 214 as input to the data preprocessor 208 as similarly described in FIG. 3. The application data 214 can include the application units 304.

FIG. 5 also depicts the formatted data 216 as the output of the data preprocessor 208 as similarly described in FIG. 3. The formatted data 216 can include the formatted units 306.

In this example, the split+redundancy function can process the aligned and non-aligned application units 304. The in-storage processing in each of the storage devices 218 of FIG. 2 in the storage group 206 of FIG. 2 can process the aligned application units 304 separately, the non-aligned application units 304 separately, or both at the same time.

In this example, the data preprocessor 208 is performing the split+redundancy function or the split+redundancy scheme. As part of this function, the split+padding function can split the application data 214 to formatted data 216 of fixed length, variable length, or a combination thereof.

Also part of the split+redundancy function is the redundancy function. For the redundancy function as the example, the data preprocessor 208 does not necessarily need to manipulate the application data 214, the application units 304, or a combination thereof that are non-aligned to the formatted units 306 as the split function described in FIG. 3. This is depicted as the first row of the formatted data 216 in FIG. 5 and is redundancy data 502. The formatted data 216 generated from the split+redundancy function includes the redundancy data 502.

As an example, the redundancy data 502 can be an output of the data preprocessor 208 mapping the application data 214, or as a more specific example the application units 304, to the formatted data 216 and across the formatted units 306 even with some of the application units 304 nonaligned with the formatted units 306. In other words, some of the application units 304 fall within the boundary of one of the formatted units 306 and these application units 304 are considered aligned. Other instances of the application units 304 traverse multiple instances of the formatted units 306 and these application units 304 are considered nonaligned. As a specific example, the application units 304 depicted as “Data 2” and “Data K” each span across two adjacent instances of the formatted units 306.

Also as an example, the split+redundancy function can also perform the split+padding function to some of the application units 304. The data preprocessor 208 can store the application units 304 that are not aligned to the formatted units 306. This is depicted in the second row of the formatted data 216 of FIG. 5 and is aligned data 504. For these particular non-aligned application units 304, the data preprocessor 208 can perform the split+padding function as described in FIG. 4 to form the aligned data 504. In the example depicted in FIG. 5, the application units 304 “Data 2” and “Data K” are not aligned to, or traverse multiple instances of, the formatted units 306. The aligned data 504 generated by the data preprocessor 208 includes the data pads 402 for these instances of the nonaligned application units 304 in the redundancy data 502.

In this example, the split+redundancy function allows the in-storage processing coordinator 204 to send up to N+M requests to the storage devices 218 in the storage group 206. Both N and M are integers. N represents the number of formatted units 306 in the redundancy data 502. M represents the additional formatted units 306 in the aligned data 504. For the in-storage processing in each of the storage devices 218, the non-aligned application units 304 in the redundancy data 502 can be ignored.
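As an illustrative sketch only, the two rows of the split+redundancy output and the resulting N+M request count can be modeled as follows. The function name split_with_redundancy and the zero-byte pad are hypothetical assumptions. The sketch assumes nonempty application units no larger than a formatted unit, and the last formatted unit in the first row can be shorter than the others when the data does not divide evenly.

```python
from typing import List, Tuple


def split_with_redundancy(units: List[bytes], fu_size: int,
                          pad_byte: bytes = b"\x00") -> Tuple[List[bytes], List[bytes]]:
    """Return (redundancy_row, aligned_row) for a list of application units.

    redundancy_row: the plain fixed-length split of the unmodified data.
    aligned_row: a padded copy of each unit that crosses a formatted unit boundary.
    """
    data = b"".join(units)
    redundancy_row = [data[i:i + fu_size] for i in range(0, len(data), fu_size)]

    aligned_row = []
    offset = 0
    for unit in units:
        if len(unit) > fu_size:
            raise ValueError("application unit is larger than the formatted unit")
        first_fu = offset // fu_size
        last_fu = (offset + len(unit) - 1) // fu_size
        if first_fu != last_fu:  # non-aligned: store a padded, aligned copy
            aligned_row.append(unit + pad_byte * (fu_size - len(unit)))
        offset += len(unit)
    return redundancy_row, aligned_row


redundancy_row, aligned_row = split_with_redundancy([b"abc", b"def", b"gh"], fu_size=4)
n, m = len(redundancy_row), len(aligned_row)
max_requests = n + m
```

In this small case the first row holds two formatted units and the second row holds one padded copy of the unit that straddled a boundary, giving up to three requests.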

Referring now to FIG. 6, therein is shown an example of an operational view for a mirroring function of the data preprocessor 208 of FIG. 2. FIG. 6 depicts the formatted data 216 as the output of the data preprocessor 208 as similarly described in FIG. 3. The formatted data 216 of FIG. 2 can include the formatted units 306 of FIG. 3. The application data 214 of FIG. 3 can be processed by the data preprocessor 208.

When the application data 214 is mirrored in this example, at least some of the storage devices 218 of FIG. 2 can receive all of the application data 214, which is replicated, also referred to as mirrored. The application data 214 that is replicated is referred to as replica data 602. FIG. 6 depicts the multiple storage devices 218 as “Device 1” through “Device r” for the replica data 602. Replicated units 604 are the application units 304 of FIG. 3 that are replicated and are shown as “Data 1”, “Data 2”, “Data 3” through “Data K” on “Device 1” through “Device r”. One of the storage devices 218 can store the application data 214 as the formatted data 216 for that storage device 218. Some of the other storage devices 218 can store the replica data 602 and the replicated units 604.

In this example, the data preprocessor 208 does not manipulate the application units 304 or the application data 214 as a whole. However, the data preprocessor 208 can collect or store mirroring information and the application units 304. Also, the in-storage processing coordinator 204 can receive the application data 214 or the application units 304 from the application 202 when processing for efficient, concurrent in-storage processing.

The in-storage processing coordinator 204 or the data preprocessor 208 can perform the mirroring functions in a number of ways. As an example, the in-storage processing coordinator 204 or the data preprocessor 208 can take into account factors for mirroring the application data 214 to the formatted data 216. One factor is the number of target devices from the multiple storage devices 218. Another factor is the size of the application data 214, the application units 304 of FIG. 3, or a combination thereof. A further factor is the size of the formatted data 216, the formatted units 306, or a combination thereof.

Referring now to FIG. 7, therein is shown an example of an architectural view of the output coordinator 212. As noted earlier, the output coordinator 212 manages the in-storage processing outputs 224 generated from each of the multiple storage devices 218 of the storage group 206 and sends them back to the application 202. The output coordinator 212 can manage the interaction with the application 202 in a number of ways.

As an example, the output coordinator 212 function can be described as an output harvest 702, an output management 704, and an output retrieval 706. The output harvest 702 is a process for collecting the in-storage processing outputs 224. For example, the output harvest 702 can collect the in-storage processing outputs 224 from each of the storage devices 218 and store them. The storage can be done locally where the output harvest 702 is being executed. Also for example, the output harvest 702 can collect the locations of the in-storage processing outputs 224 in each of the storage devices 218.

The following are examples of various embodiments of how the output coordinator 212, or as a specific example the output harvest 702, can collect the in-storage processing outputs 224 from the storage devices 218. As an example, the output coordinator 212 can fetch the in-storage processing outputs 224 or their locations from each of the storage devices 218 that performed the in-storage processing of the application data 214 of FIG. 2, the formatted data 216 of FIG. 2, or a combination thereof.

As an example, the output coordinator 212 can fetch the in-storage processing outputs 224 in a number of ways. For example, the output coordinator 212 can utilize a direct memory access (DMA) with the storage devices 218. DMA transfers are transfer mechanisms not requiring a processor or a computing resource to manage the actual transfer once the transfer is set up. As another example, the output coordinator 212 can utilize a programmed input/output (PIO) with the storage devices 218. PIO transfers are transfer mechanisms where a processor or a computing resource manages the actual transfer of data and not just the setup and status collection at a termination of the transfer. As a further example, the output coordinator 212 can utilize interface protocol commands, such as SATA vendor specific commands, PCIe, DMA, or Ethernet commands.

As an example, the storage devices 218 can send the in-storage processing outputs 224 to the output coordinator 212 in a number of ways. For example, the output coordinator 212 can utilize the DMA or PIO mechanisms. The DMA can be a remote DMA (rDMA) whereby the transfer is a DMA process from memory of one computer (e.g. the computer running the application 202) into that of another (e.g. one of the storage devices 218 for the in-storage processing) without involving either one's operating system or processor intervention for the actual transfer. As another example, the output coordinator 212 can utilize interface protocol processes, such as background SATA connection or Ethernet.

Also for example, the storage devices 218 can send their respective in-storage processing outputs 224 or their locations to the application 202. This can be accomplished without the in-storage processing outputs 224 passing through the output coordinator 212. For this example, the storage devices 218 and the application 202 can interact in a number of ways, such as DMA, rDMA, PIO, back SATA connection, or Ethernet.

Regarding the output management 704, the output coordinator 212 can manage the order of the outputs from the storage devices 218. The output management 704 manages the outputs based on multiple constraints, such as the size of the output, the storage capacity of the output coordinator 212, and the types of the application requests 220 of FIG. 2. The outputs can be the in-storage processing outputs 224. As an example, the output management 704 can order the outputs based on various policies.

As a specific example, the outputs or the in-storage processing outputs 224 for each of the sub-application requests 222 of FIG. 2 for in-storage processing can be stored in a sorted order by a sub-request identification 708 per a request identification 710 for the in-storage processing. The request distributor 210 can transform the application request 220 of FIG. 2 into multiple sub-application requests 222 with the formatted data 216 and distribute them to the storage devices 218.

After data processing in each of the storage devices 218, the output coordinator 212 gathers the in-storage processing outputs 224 from each of the storage devices 218. The output coordinator 212 may need to preserve the issuing order of the application requests 220, the sub-application requests 222, or a combination thereof even though the in-storage processing outputs 224 from the storage devices 218 can be delivered to the output coordinator 212 in an arbitrary order because the data processing time of the storage devices 218 can be different.

As an example to implement this order, the storage group 206 of FIG. 2 can assign a sequence number to each of the in-storage processing outputs 224, where each of the in-storage processing outputs 224 also can be composed of multiple sub-outputs. For these sub-outputs, the storage group 206 also assigns sequence numbers or sequence identifications. Once the output coordinator 212 receives each of the in-storage processing outputs 224 or sub-output data from each of the storage devices 218, it can maintain each output's sequence, thereby sorting them by sequence numbers or identifications. If the order of the in-storage processing outputs 224 or the sub-outputs is not important for the application 202, the output coordinator 212 can send the in-storage processing outputs 224 in an out of order manner.

The request identification 710 represents information that can be used to demarcate one of the application requests 220 from another. The sub-request identification 708 represents information that can be used to demarcate one of the sub-application requests 222 from another.

As an example, the sub-request identification 708 can be unique or associated with a specific instance of the request identification 710. As a further example, the sub-request identification 708 can be non-constrained to a specific instance of the request identification 710.
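As an illustrative sketch only, storing outputs in sorted order by sub-request identification per request identification could look like the following. The tuple layout (request id, sub-request id, payload) and the function name group_sorted are hypothetical assumptions, with integer identifications used for simplicity.

```python
from typing import Dict, List, Tuple

Output = Tuple[int, int, str]  # (request id, sub-request id, payload)


def group_sorted(outputs: List[Output]) -> Dict[int, List[str]]:
    """Group payloads by request id, ordered by sub-request id within each group."""
    grouped: Dict[int, List[str]] = {}
    # Sort on the two identifications only, so payloads are never compared.
    for request_id, _sub_id, payload in sorted(outputs, key=lambda o: (o[0], o[1])):
        grouped.setdefault(request_id, []).append(payload)
    return grouped


# Outputs arriving in an arbitrary order from different storage devices.
arrived = [(7, 2, "c"), (8, 0, "x"), (7, 0, "a"), (7, 1, "b")]
ordered = group_sorted(arrived)
```

The sketch associates each sub-request identification with one request identification; a scheme in which sub-request identifications are not constrained to a request would need a different key.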

As a more specific example, the output coordinator 212 can include an output buffer 712. The output buffer 712 can store the in-storage processing outputs 224 from the storage devices 218. The output buffer 712 can be implemented in a number of ways. For example, the output buffer 712 can be a hardware implementation of a first-in first-out (FIFO) circuit or of a linked list structure. Also for example, the output buffer 712 can be implemented with memory circuitry with the software providing the intelligence for the FIFO operations, such as pointers, status flags, etc.

Also as a specific example, the outputs or the in-storage processing outputs 224 for each of the sub-application requests 222 can be added to the output buffer 712. The in-storage processing outputs 224 can be fetched from the output buffer 712 as long as the output for the desired instance of the sub-application requests 222 is in the output buffer 712. The sub-request identification 708 can be utilized to determine whether the associated in-storage processing output 224 has been stored in the output buffer 712. The request identification 710 can also be utilized, such as an initial determination.

Continuing the example for various embodiments, the output coordinator 212 can collect the in-storage processing outputs 224 from the storage devices 218. To guarantee the data integrity of the in-storage processing outputs 224, the output coordinator 212 can maintain the sequence of each of the in-storage processing outputs 224 or sub-output data in a correct order. For this, the output coordinator 212 can utilize the sub-request identification 708 or the request identification 710 (e.g. if each of the in-storage processing outputs 224 of each of the storage devices 218 also reuses the same identification as its output sequence number or output sequence identification). Since the processing times of each of the storage devices 218 can be different, the output coordinator 212 can temporarily store each of the in-storage processing outputs 224 or sub-output data into the output buffer 712 to make them all sequential (i.e., correct data order). If there exists any missing in-storage processing output 224 or sub-output (that is, a hole in the sequence IDs), the application 202 cannot get the output data until all the in-storage processing outputs 224 are correctly collected in the output buffer 712.
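As an illustrative sketch only, an output buffer that withholds the output data while a hole remains in the sequence can be modeled as follows. The class name OutputBuffer, the expected count, and the zero-based integer sequence identifications are hypothetical assumptions made for the example.

```python
from typing import Dict, List, Optional


class OutputBuffer:
    """Hypothetical buffer holding sub-outputs until the sequence is complete."""

    def __init__(self, expected: int) -> None:
        self.expected = expected          # number of sub-outputs for one request
        self.held: Dict[int, str] = {}    # sequence id -> output

    def add(self, sequence_id: int, output: str) -> None:
        # Outputs can arrive in any order because device processing times differ.
        self.held[sequence_id] = output

    def missing(self) -> List[int]:
        """Sequence ids not yet collected (the holes)."""
        return [i for i in range(self.expected) if i not in self.held]

    def fetch(self) -> Optional[List[str]]:
        """Return all outputs in sequence order, or None while any hole remains."""
        if self.missing():
            return None
        return [self.held[i] for i in range(self.expected)]


buffer = OutputBuffer(expected=3)
buffer.add(2, "c")
buffer.add(0, "a")
early = buffer.fetch()        # a hole at sequence id 1 blocks retrieval
holes = buffer.missing()
buffer.add(1, "b")
complete = buffer.fetch()
```

When the order of the outputs does not matter to the application, this gating step could be skipped and outputs forwarded as they arrive.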

As a further specific example, the outputs or the in-storage processing outputs 224 for each of the sub-application requests 222 can be sent to the application 202 without passing through the output coordinator 212 or the output buffer 712 in the output coordinator 212. In this example, the in-storage processing outputs 224 can be sent from the storage devices 218 without being stored before reaching the application 202.

Regarding the output retrieval 706, once the output or the in-storage processing outputs 224 are known, the application 202 can retrieve the in-storage processing outputs 224 in a number of ways. In some embodiments, the output retrieval 706 can include the in-storage processing outputs 224 passing through the output coordinator 212. In other embodiments, the output retrieval 706 can include the in-storage processing outputs 224 being sent to the application 202 without passing through the output buffer 712.

As an example, the outputs or the in-storage processing outputs 224 can be passed from the storage devices 218 to the output coordinator 212. The output coordinator 212 can store the in-storage processing outputs 224 in the output buffer 712. The output coordinator 212 can then send the in-storage processing outputs 224 to the application 202.

Also as an example, the outputs or the in-storage processing outputs 224 can be passed from the storage devices 218 to the output coordinator 212. The output coordinator 212 can send the in-storage processing outputs 224 to the request distributor 210. The request distributor 210 can send the in-storage processing outputs 224 to the application 202. In the example, the output buffer 712 can be within the output coordinator 212, the request distributor 210, or a combination thereof.

Further as an example, the outputs or the in-storage processing outputs 224 can be passed from the storage devices 218 to the application 202. In this example, this transfer is direct without the in-storage processing outputs 224 passing through the output coordinator 212, the request distributor 210, or a combination thereof.

The output coordinator 212 can be implemented in a number of ways. For example, the output coordinator 212 can be implemented in hardware circuitry, such as a processor, an application specific integrated circuit (ASIC), an embedded processor, a microprocessor, a hardware control logic, a hardware finite state machine (FSM), a digital signal processor (DSP), a field programmable gate array (FPGA), or a combination thereof. Also for example, the output coordinator 212 can be implemented with software. Further for example, the output harvest 702, the output management 704, the output retrieval 706, or a combination thereof can be implemented with hardware circuitry, with the examples noted earlier, or by software.

Similarly, the request distributor 210 can be implemented in a number of ways. For example, the request distributor 210 can be implemented in hardware circuitry, such as a processor, an application specific integrated circuit (ASIC), an embedded processor, a microprocessor, a hardware control logic, a hardware finite state machine (FSM), a digital signal processor (DSP), a field programmable gate array (FPGA), or a combination thereof. Also for example, the output coordinator 212 can be implemented with software.

Referring now to FIGS. 8A and 8B, therein are shown detailed examples of an operational view of the split and split+padding functions. FIGS. 8A and 8B depict embodiments for an in-storage processing (ISP)-aware RAID. Various embodiments can be applied to an array configuration for the storage devices 218 of FIG. 2 or the storage group 206 of FIG. 2. Examples of RAID functions include striping, mirroring, or a combination thereof.

FIGS. 8A and 8B depict examples of the application data 214 and the application units 304. The application units 304 can be processed by the in-storage processing coordinator 204 of FIG. 2. FIGS. 8A and 8B each depict one example.

The example in FIG. 8A depicts the application data 214 undergoing the split function, similarly to the one described in FIG. 3. This depiction can also represent a striping function in a RAID application.

The example in FIG. 8B depicts the application data 214 undergoing a split+padding function, similarly to the one described in FIG. 4. This depiction can also represent a striping function in a RAID application but for various embodiments providing the in-storage processing for the split+padding function.

Describing FIG. 8A, this part depicts the formatted data 216 and the formatted units 306. In this example, the formatted data 216 is split and sent to two of the storage devices 218. Each of the formatted units 306 includes one or more of the application units 304, such as FU0 in DEV1, which can include AU0 and AU1. These application units 304, such as AU0, AU1, AU2, AU4, etc., can be each entirely contained in one of the formatted units 306 or traverse or span across multiple formatted units 306, such as AU3, AU6, AU8, etc. As described in FIG. 3, some of the application units 304 are aligned with the formatted units 306 while others are not.

In this example, there are shown 10 of the application units 304 being split into the formatted units 306 that are sent to two of the storage devices 218. In this example, the application units 304 labeled as AU1, AU3, AU5, AU6, and AU8 are not aligned. These non-aligned application units 304 can be identified with in-storage processing and separately processed by host systems or cooperatively with other storage devices 218. Therefore, the application requests 220 of FIG. 2 for in-storage processing can be serialized and more complex request coordination could be required.

Describing FIG. 8B, this part depicts the formatted data 216 and the formatted units 306, as in FIG. 8A. As in the example of FIG. 8A, this example depicts the formatted data 216 being split in some form and sent to two of the storage devices 218. In this example, each of the application units 304, such as AU0, AU1, AU2, AU3, etc., can be aligned with one of the formatted units 306 with one of the data pads 402, as similarly described in FIG. 4.

In this example for use in ISP-aware RAID, the application units 304 are pre-processed and aligned by the split+padding policy, allowing each of the application requests 220 for in-storage processing to be independent. This independence can maximize the opportunity for efficient, concurrent processing since no additional phase of processing is required for the formatted units 306 with the aligned application units 304, compared with the non-aligned units.
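As an illustrative sketch only, the independence of padded formatted units can be shown by processing them concurrently with no second phase. Worker threads stand in for the storage devices, and process_unit is a hypothetical stand-in for an in-storage function; the sketch assumes zero-byte padding at the end of each unit and application data that does not itself end in zero bytes.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List

PAD = b"\x00"  # assumed pad value, not taken from the figures


def process_unit(formatted_unit: bytes) -> int:
    """Stand-in in-storage function: length of the unit with trailing pad removed."""
    return len(formatted_unit.rstrip(PAD))


def run_independent_requests(formatted_units: List[bytes], num_devices: int) -> List[int]:
    """Process each formatted unit as its own request.

    Because every formatted unit holds one whole application unit, any
    worker can take any request, and results need no merging across units.
    """
    with ThreadPoolExecutor(max_workers=num_devices) as pool:
        # map() returns results in the order the requests were issued.
        return list(pool.map(process_unit, formatted_units))


results = run_independent_requests([b"abc\x00", b"de\x00\x00", b"fghi"], num_devices=2)
```

With the plain split scheme, by contrast, a unit that straddles two formatted units would need its partial results combined after the first pass.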

Referring now to FIG. 9, therein is shown an example of an architectural view of the computing system 900 in an embodiment. The computing system 900 can be an embodiment of the computing system 100 of FIG. 1.

In this embodiment as an example, FIG. 9 depicts the in-storage processing coordinator 904 in a centralized coordination model. In this model, the in-storage processing coordinator 904 is separate from or external to the host computer 102 and the storage devices 218. The terms separate and external represent that the in-storage processing coordinator 904 is in a separate system from the host computer 102 and the storage devices 218 and can be housed in a separate system housing.

In this example, the host computer 102 can be executing the application 202 of FIG. 2. The host computer 102 can also provide file and object services. Further to this example, the in-storage processing coordinator 904 can be included as part of the network 120 of FIG. 1, the data storage system 101 of FIG. 1, implemented external to the host computer 102, or a combination thereof. As previously described in FIG. 2 and other figures earlier, the in-storage processing coordinator 904 can include the request distributor 910, the data preprocessor 908, and the output coordinator 912.

Continuing with this example, each of the storage devices 218 performs the in-storage processing functions. Each of the storage devices 218 can include an in-storage processing engine 922. The in-storage processing engine 922 can perform the in-storage processing for its respective storage device 218.

The storage devices 218 can be located in a number of places within the computing system 100. For example, the storage devices 218 can be located within the data storage system 101 of FIG. 1, as part of the network 120 of FIG. 1, the hard disk drive 116 of FIG. 1 or storage external to the host computer 102, or as part of the network attached storage 122 of FIG. 1.

In various embodiments in a centralized coordination model as in this example, the in-storage processing coordinator 904 can function with the storage devices 218 in a number of ways. For example, the storage devices 218 can be configured to support various functions, such as RAID 0, 1, 2, 3, 4, 5, 6, and object stores.

The in-storage processing engine 922 can be implemented in a number of ways. For example, the in-storage processing engine 922 can be implemented with software, hardware circuitry, or a combination thereof. Examples of hardware circuitry can include a processor, an application specific integrated circuit (ASIC), an embedded processor, a microprocessor, a hardware control logic, a hardware finite state machine (FSM), a digital signal processor (DSP), a field-programmable gate array (FPGA), or a combination thereof.

Referring now to FIG. 10, therein is shown an example of an architectural view of the computing system 1000 in a further embodiment. The computing system 1000 can be an embodiment of the computing system 100 of FIG. 1.

In this embodiment as an example, FIG. 10 depicts the in-storage processing coordinator 1004 in a centralized coordination model. In this model, the in-storage processing coordinator 1004 is internal to the host computer 102. The term internal represents that the in-storage processing coordinator 1004 is in the same system as the host computer 102 and is generally housed in the same system housing as the host computer 102. This embodiment also has the in-storage processing coordinator 1004 as separate from or external to the storage devices 218.

In this embodiment as an example, the host computer 102 can include the in-storage processing coordinator 1004 as well as the file and object services. In this example, the host computer 102 can execute the application 202 of FIG. 2. As previously described in FIG. 2 and other figures earlier, the in-storage processing coordinator 1004 can include the request distributor 1010, the data preprocessor 1008, and the output coordinator 1012.

Continuing with this example, each of the storage devices 218 performs the in-storage processing function. Each of the storage devices 218 can include an in-storage processing engine 1022. The in-storage processing engine 1022 can perform the in-storage processing for its respective storage device 218.

The storage devices 218 can be located in a number of places within the computing system 100. For example, the storage devices 218 can be located within the data storage system 101 of FIG. 1, as part of the network 120 of FIG. 1, the hard disk drive 116 of FIG. 1 or storage external to the host computer 102, or as part of the network attached storage 122 of FIG. 1.

In various embodiments in a centralized model as in this example, the in-storage processing coordinator 1004 can function with the storage devices 218 in a number of ways. For example, the storage devices 218 can be configured to support various functions, such as RAID 0, 1, 2, 3, 4, 5, 6, and object stores.

The in-storage processing engine 1022 can be implemented in a number of ways. For example, the in-storage processing engine 1022 can be implemented with software, hardware circuitry, or a combination thereof. Examples of hardware circuitry can include similar examples as in FIG. 9. The functions for this embodiment will be described in detail later.

Referring now to FIG. 11, therein is shown an example of an architectural view of the computing system 1100 in a yet further embodiment. The computing system 1100 can be an embodiment of the computing system 100 of FIG. 1.

In this embodiment as an example, FIG. 11 depicts the in-storage processing coordinator 1104 in a decentralized coordination model. In this example, the in-storage processing coordinator 1104 is partitioned between the host computer 102 and the storage devices 218. Additional examples of operational flow for this model are described in FIG. 15 and in FIG. 16.

As previously described in FIG. 2 and other figures earlier, the in-storage processing coordinator 1104 can include the request distributor 1110, the data preprocessor 1108, or a combination thereof. In this embodiment as an example, the data preprocessor 1108 and at least a portion of the request distributor 1110 are internal to the host computer 102. The term internal represents that the request distributor 1110 and the data preprocessor 1108 are in the same system as the host computer 102 and housed in the same system housing as the host computer 102.

Also, this embodiment has the output coordinator 1112 and at least a portion of the request distributor 1110 separate from or external to the host computer 102. As a specific example, this embodiment provides the output coordinator 1112 and at least a portion of the request distributor 1110 as internal to the storage devices 218.

In this example, the host computer 102 can execute the application 202 of FIG. 2. Continuing with this example, each of the storage devices 218 performs the in-storage processing function. Each of the storage devices 218 can include an in-storage processing engine 1122. The in-storage processing engine 1122 can perform the in-storage processing for its respective storage device 218.

The storage devices 218 can be located in a number of places within the computing system 100. For example, the storage devices 218 can be located within the data storage system 101 of FIG. 1, as part of the network 120 of FIG. 1, the hard disk drive 116 of FIG. 1 or storage external to the host computer 102, or as part of the network attached storage 122 of FIG. 1.

In various embodiments in a decentralized model as in this example, this partition of the in-storage processing coordinator 1104 can function with the storage devices 218 in a number of ways. For example, the storage devices 218 can be configured to support various functions, such as RAID 1 and object stores.

The in-storage processing engine 1122 can be implemented in a number of ways. For example, the in-storage processing engine 1122 can be implemented with software, hardware circuitry, or a combination thereof. Examples of hardware circuitry can include similar examples as in FIG. 9.

Referring now to FIG. 12, therein is shown an example of an operational view of the computing system 100 for in-storage processing in a centralized coordination model. FIG. 12 can represent embodiments for the centralized coordination model described from FIG. 9 or FIG. 10.

FIG. 12 depicts the in-storage processing coordinator 204 and the interaction between the request distributor 210 and the data preprocessor 208 for the centralized coordination model. FIG. 12 also depicts the output coordinator 212. FIG. 12 also depicts the in-storage processing coordinator 204 interacting with the storage devices 218.

As an operational example, FIG. 12 depicts the in-storage processing coordinator 204 issuing the device requests 1202 for in-storage processing, such as write requests to the storage devices 218. The request distributor 210 can receive the application requests 220 of FIG. 2 for writing the application data 214. The request distributor 210 can also receive a data address 1204 as well as the transfer length 302 and a logical boundary 1206 of the application units 304. The data address 1204 can represent the address for the application data 214. The logical boundary 1206 represents the length or size of each of the application units 304.

Continuing with the example, the request distributor 210 can send information to the data preprocessor 208 to translate the application data 214 to the formatted data 216. The request distributor 210 can also send the transfer length 302 for the application data 214. The application data 214 can be sent to the data preprocessor 208 as the application units 304 or the logical boundaries to the application units 304.

Furthering the example, the data preprocessor 208 can translate the application data 214 or the application units 304 to generate the formatted data 216 or the formatted units 306 of FIG. 3. Examples of the types of translation can be one of the methods described in FIG. 2 and FIG. 3 through FIG. 6. The data preprocessor 208 can return the formatted data 216 or the formatted units 306 to the request distributor 210. The request distributor 210 can generate and issue device requests 1202 for writes to the storage devices 218 based on the formatting policies and policy for storing or for in-storage processing of the formatted data 216 or the formatted units 306. The device requests 1202 are based on the application requests 220.

Further continuing with the example, each of the storage devices 218 can include an in-storage processing function or application and the in-storage processing engine 922. Each of the storage devices 218 can receive the device requests 1202 and at least a portion of the formatted data 216.

For illustrative purposes, although FIG. 12 depicts the device requests 1202 being issued to all of the storage devices 218, it is understood that the request distributor 210 can operate differently. For example, the device requests 1202 can be issued to some of the storage devices 218 and not necessarily to all of them. Also for example, the device requests 1202 can be issued at different times or can be issued as part of the error handling examples as discussed in FIG. 2.

As a specific example for a centralized coordination model, the in-storage processing coordinator 204 can receive all the application requests 220 from the application 202, can issue all the device requests 1202 to the storage devices 218, or a combination thereof. The request distributor 210 can send or distribute the device requests 1202 to multiple storage devices 218 based on a placement scheme. The output coordinator 212 can collect and manage the in-storage processing outputs 224 from the storage devices 218. The output coordinator 212 can then send the in-storage processing outputs 224 to the application 202 of FIG. 2 as similarly described in FIG. 7.

Referring now to FIG. 13, therein is shown an example of an operational view of the computing system 1300 issuing data write requests to the storage devices 1318 for in-storage processing in a decentralized coordination model. The computing system 1300 can include similarities to the computing system 1100 of FIG. 11. FIG. 13 depicts the in-storage processing coordinator 1304 including the request distributor 1310 and the data preprocessor 1308.

Both FIG. 12 and FIG. 13 depict an example of an operational view of the computing system 1300 in terms of storing data to the storage devices 218. That is, both FIGS. 12 and 13 focus on how to efficiently store data across the storage devices 218 for in-storage processing.

FIG. 13 also depicts the output coordinator 1312 and a portion of the request distributor 1310 in each of the devices 1318. FIG. 13 also depicts the in-storage processing coordinator 1304 interacting with the devices 1318.

As an operational example, FIG. 13 depicts the in-storage processing coordinator 1304 issuing the device requests 1302 as write requests to the devices 1318. The request distributor 1310 in the in-storage processing coordinator 1304 can receive the application requests 220 of FIG. 2 for writing the application data 214 of FIG. 2. The request distributor 1310 can also receive a data address 1204 as well as the transfer length 302 of FIG. 3 and the logical boundary of the application units 304 of FIG. 3. The data address 1204 can represent the address for the application data 214.

Continuing with the example, the request distributor 1310 can send information to the data preprocessor 1308 to translate the application data 214 to the formatted data 216 of FIG. 2. The request distributor 1310 can also send the transfer length 302 for the application data 214. The application data 214 can be sent to the data preprocessor 1308 as the application units 304 or the logical boundaries to the application units 304.

Furthering the example, the data preprocessor 1308 can translate the application data 214 or the application units 304 to generate the formatted data 216 or the formatted units 306 of FIG. 3. Examples of the types of translation can be one of the methods described in FIG. 2 and FIG. 3 through FIG. 6. The data preprocessor 1308 can return the formatted data 216 or the formatted units 306 to the request distributor 1310 in the in-storage processing coordinator 1304. The request distributor 1310 can generate and issue the application requests 220 for writes to the devices 1318 based on the formatting policies and policy for storing or for in-storage processing of the formatted data 216 or the formatted units 306.

Further continuing with the example, each of the devices 1318 can include an in-storage processing function or application and the in-storage processing engine 1322. Each of the devices 1318 can receive the device requests 1302 and at least a portion of the formatted data 216. Each of the devices 1318 can also include the output coordinator 1312, a portion of the request distributor 1310, or a combination thereof.

For illustrative purposes, although FIG. 13 depicts the device requests 1302 being issued to all of the devices 1318, it is understood that the request distributor 1310 can operate differently. For example, the device requests 1302 can be issued to some of the devices 1318 and not necessarily to all of them. Also for example, the device requests 1302 can be issued at different times or can be issued as part of the error handling examples as discussed in FIG. 2.

As a specific example for a decentralized coordination model, the in-storage processing coordinator 1304 can receive the application requests 220 from the application 202, can issue the device requests 1302 to the devices 1318, or a combination thereof. The request distributor 1310 in the in-storage processing coordinator 1304 can send or distribute the device requests 1302 to multiple devices 1318 based on a placement scheme.

Continuing with the specific example, the request distributor 1310 in each of the devices 1318 can receive the request from the in-storage processing coordinator 1304. The output coordinator 1312 can collect and manage the in-storage processing outputs 224 from the devices 1318 or one of the devices 1318.

Also as a specific example for a decentralized coordination model, there are various communication methods depending on the configuration of the storage group 206. The functions of the request distributor 1310 and the output coordinator 1312 in the devices 1318 in a decentralized coordination model will be described later.

Referring now to FIG. 14, therein is shown an operational view for the computing system 100 for in-storage processing in a centralized model. FIG. 14 depicts the in-storage processing coordinator 904 to be external to both the host computer 102 and the storage devices 218. Although the application 202 is shown outside of the host computer 102, it is understood that the application 202 can be executed by the host computer 102 as well as outside of the host computer 102. In addition, although the in-storage processing coordinator 904 is external to the host in FIG. 14, it is also understood that the in-storage processing coordinator 904 can be internal to the host, as in FIG. 10.

FIG. 14, FIG. 15, and FIG. 16 depict an example of an operational view of the computing system 1300 of FIG. 13 in terms of processing data in the storage devices 218. That is, FIGS. 14, 15, and 16 focus on how to efficiently process or compute the stored data in the storage devices 218 with in-storage processing techniques.

In this example, the application 202 can issue application requests 220 for in-storage processing to the host computer 102. The host computer 102 can issue host requests 1402 based on the application requests 220 from the application 202. The host requests 1402 can be sent to the in-storage processing coordinator 904.

The in-storage processing coordinator 904 can translate the application data 214 of FIG. 2 and the application units 304 of FIG. 3 to generate the formatted data 216 of FIG. 2 and the formatted units 306 of FIG. 3. The in-storage processing coordinator 904 can also generate the device requests 1202 to the storage devices 218. The in-storage processing coordinator 904 can also collect and manage the in-storage processing outputs 224 from the storage devices 218, and can deliver an aggregated output 1404 back to the host computer 102, the application 202, or a combination thereof. The aggregated output 1404 is the combination of the in-storage processing outputs 224 from the storage devices 218. The aggregated output 1404 can be more than a concatenation of the in-storage processing outputs 224.

As a specific example, the in-storage processing coordinator 904 can include the request distributor 910. The request distributor 910 can receive the application requests 220 as the host requests 1402. The request distributor 910 can generate the device requests 1202 from the host requests 1402. The request distributor 910 can also generate the sub-application requests 222 of FIG. 7 as the device requests 1202.

As a further specific example, the in-storage processing coordinator 904 can include the data preprocessor 908. The data preprocessor 908 can receive the information from the application requests 220 or the host requests 1402 through the request distributor 910. The data preprocessor 908 can format the application data 214 as appropriate based on the placement scheme onto the storage devices 218.

Also as a specific example, the in-storage processing coordinator 904 can include the output coordinator 912. The output coordinator 912 can receive the in-storage processing outputs 224 from the storage devices 218. The output coordinator 912 can generate the aggregated output 1404 with the in-storage processing outputs 224. In this example, the output coordinator 912 can return the aggregated output 1404 to the host computer 102. The host computer 102 can also return the aggregated output 1404 to the application 202. The application 202 can continue to execute and utilize the in-storage outputs 224, the aggregated output 1404, or a combination thereof.
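As a minimal sketch of the centralized flow described for FIG. 14, the following assumes a hypothetical in-storage function that counts the formatted units containing a keyword. The names `in_storage_count` and `centralized_request`, the dictionary-based device model, and the choice of a sum as the aggregation are assumptions for illustration, chosen to show an aggregated output that is more than a concatenation.

```python
def in_storage_count(formatted_units, keyword):
    """Hypothetical in-storage function: count local units containing keyword."""
    return sum(1 for unit in formatted_units if keyword in unit)


def centralized_request(storage_devices, keyword):
    """Sketch of a centralized coordinator handling one host request.

    storage_devices maps a device name to its locally stored formatted units.
    """
    # Request distributor: derive one device request per storage device.
    device_requests = {name: keyword for name in storage_devices}
    # In-storage processing engines: each device works only on its own data.
    outputs = {
        name: in_storage_count(storage_devices[name], request)
        for name, request in device_requests.items()
    }
    # Output coordinator: combine the per-device outputs into one result.
    aggregated_output = sum(outputs.values())
    return outputs, aggregated_output


devices = {
    "DEV1": [b"error: disk", b"ok"],
    "DEV2": [b"error: net", b"error: cpu"],
}
per_device, total = centralized_request(devices, b"error")
```

In this sketch the caller can use either the per-device outputs or the single aggregated value, mirroring the option for the application to use the in-storage outputs, the aggregated output, or both.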

In this example, each of the storage devices 218 includes the in-storage processing engine 922. The in-storage processing engine 922 can receive and operate on a specific instance of the device requests 1202. The in-storage processing engine 922 can generate the in-storage processing output 224 to be returned to the in-storage processing coordinator 904 or, as a specific example, to the output coordinator 912.

Referring now to FIG. 15, therein is shown an operational view for a computing system 1500 in a decentralized model in an embodiment with one output coordinator 1512. The computing system 1500 can be the computing system 1100 of FIG. 11.

As an operational overview of this embodiment, the host computer 102 can issue an application request 220 to the storage devices 218 for in-storage processing. The host computer 102 and the storage devices 218 can be similarly partitioned as described in FIG. 11. Each of the storage devices 218 can perform the in-storage processing. Each of the storage devices 218 can provide its in-storage processing output 224 to the storage device 218 that received the application request 220 from the host computer 102. This storage device 218 can then return an aggregated output 1504 back to the host computer 102, the application 202, or a combination thereof. The application 202 can continue to execute and utilize the in-storage outputs 224, the aggregated output 1504, or a combination thereof.

Continuing with the example, the application request 220 can be issued to one of the storage devices 218. That one storage device 218 can issue the application request 220 or a device request 1202 to the other storage devices 218. As an example, the storage device 218 that received the application request 220 can decompose the application request 220 to partition the in-storage processing to the other storage devices 218. The device request 1202 can be that partitioned request based off the application request 220 and the in-storage processing execution by the previous storage devices 218.

This example depicts a number of the devices labeled as “DEV_1”, “DEV_2”, “DEV_3”, and through “DEV_N”. The term “N” in the figure is an integer. The storage devices 218 in this example can perform in-storage processing. Each of the storage devices 218 is shown including an in-storage processing engine 1522, a data preprocessor 1508, and an output coordinator 1512.

For illustrative purposes, all of the storage devices 218 are shown with the output coordinator 1512, although it is understood that the computing system 1500 can be partitioned differently. For example, only one of the storage devices 218 can include the output coordinator 1512. Further for example, the output coordinator 1512 in each of the storage devices 218 can operate differently from another. As a specific example, the output coordinator 1512 in DEV_2 through DEV_N can act as a pass-through to the next storage device 218 or return the in-storage processing output 224 back to DEV_1. Each of the storage devices 218 can manage its request identification 710 of FIG. 7, the sub-request identification 708 of FIG. 7, or a combination thereof.

In this example, the host computer 102 can send the application request 220 to one of the storage devices 218 labeled DEV_1. The in-storage processing engine 1522 in DEV_1 can perform the appropriate level of in-storage processing and generate the in-storage processing output 224. In this example, the in-storage processing output 224 from DEV_1 can be referred to as a first output 1524.

Continuing with this example, the data preprocessor 1508 in DEV_1 can format or translate the information from the application request 220 that will be forwarded to DEV_2, DEV_3, and through to DEV_N. The in-storage processing engine 1522 in DEV_2 can generate the in-storage processing output 224, which can be referred to as a second output 1526. The output coordinator 1512 in DEV_2 can send the second output 1526 to DEV_1. The in-storage processing engine 1522 in DEV_3 can generate the in-storage processing output 224, which can be referred to as a third output 1528. The output coordinator 1512 in DEV_3 can send the third output 1528 to DEV_1. The in-storage processing engine 1522 in DEV_N can generate the in-storage processing output 224, which can be referred to as an Nth output. The output coordinator 1512 in DEV_N can send the Nth output to DEV_1. The output coordinator 1512 in DEV_1 generates the aggregated output 1504 that includes the first output 1524, the second output 1526, the third output 1528, and through the Nth output.

Referring now to FIG. 16, therein is shown an operational view for a computing system 1600 in a decentralized model in an embodiment with multiple output coordinators 1612. The computing system 1600 can be the computing system 1100 of FIG. 11.

As an operational overview of this embodiment, the host computer 102 can issue an application request 220 to the storage devices 218 for in-storage processing. The host computer 102 and the storage devices 218 can be similarly partitioned as described in FIG. 11. The application request 220 can be issued to one of the storage devices 218. That storage device 218 then performs the in-storage processing. The application request 220 and the in-storage processing results are then issued or sent to another of the storage devices 218. This process can continue until all the storage devices 218 have performed the in-storage processing, and the last of the storage devices 218 can return the result to the first of the storage devices 218. That first of the storage devices 218 then returns an aggregated output 1604 back to the host computer 102, the application 202, or a combination thereof. The application 202 can continue to execute and utilize the in-storage outputs 224 of FIG. 2, the aggregated output 1604, or a combination thereof.

For illustrative purposes, this embodiment is described with DEV_1 providing the aggregated output 1604 to the host computer 102, although it is understood that this embodiment can operate differently. For example, the last device or DEV_N in this example can provide the aggregated output 1604 back to the host computer 102 instead of DEV_1.

This example depicts a number of the storage devices 218 labeled as “DEV_1”, “DEV_2”, “DEV_3”, and through “DEV_N”. The term “N” in the figure is an integer. The storage devices 218 in this example can perform in-storage processing. Each of the storage devices 218 is shown including an in-storage processing engine 1622, a data preprocessor 1608, and an output coordinator 1612.

For illustrative purposes, all of the storage devices 218 are shown with the output coordinator 1612, although it is understood that the computing system 1600 can be partitioned differently. For example, only one of the storage devices 218 can include the output coordinator 1612 with full functionality. Further for example, the output coordinator 1612 in each of the storage devices 218 can operate differently from another. As a specific example, the output coordinator 1612 in DEV_2 through DEV_N can act as a pass-through to the next storage device 218 or return the aggregated output 1604 back to DEV_1.

In this example, the host computer 102 can send the application request 220 to one of the storage devices 218 labeled DEV_1. The in-storage processing engine 1622 in DEV_1 can perform the appropriate level of in-storage processing and can generate the in-storage processing output 224. In this example, the in-storage processing output 224 from DEV_1 can be referred to as a first output 1624. In this example, DEV_1 can decompose the application request 220 to partition the in-storage processing to DEV_2. The device request 1202 of FIG. 12 can be that partitioned request based off the application request 220 and the in-storage processing execution by DEV_1. This process of decomposing and partitioning can continue through DEV_N.

Continuing with this example, the data preprocessor 1608 in DEV_1 can format or translate the information from the application request 220 that will be forwarded to DEV_2. The data preprocessor 1608 in DEV_1 can also format or translate the in-storage processing output 224 from DEV_1 or the first output 1624.

Furthering this example, the output coordinator 1612 in DEV_1 can send the output of the data preprocessor 1608 in DEV_1, the first output 1624, a portion of the application request 220, or a combination thereof to DEV_2. DEV_2 can continue the in-storage processing of the application request 220 sent to DEV_1.

Similarly, the in-storage processing engine 1622 in DEV_2 can perform the appropriate level of in-storage processing based on the first output 1624 and can generate the in-storage processing output 224 from DEV_2. In this example, the in-storage processing output 224 from DEV_2 can be referred to as a second output 1626 or as “a partial aggregated output.”

Continuing with this example, the data preprocessor 1608 in DEV_2 can format or translate the information from the application request 220 or the second output 1626 that will be forwarded to DEV_3. The data preprocessor 1608 in DEV_2 can also format or translate the in-storage processing output 224 from DEV_2 or the second output 1626.

Furthering this example, the output coordinator 1612 in DEV_2 can send the output of the data preprocessor 1608 in DEV_2, the second output 1626, a portion of the application request 220, or a combination thereof to DEV_3. DEV_3 can continue the in-storage processing of the application request 220 sent to DEV_1.

Similarly, the in-storage processing engine 1622 in DEV_3 can perform the appropriate level of in-storage processing based on the second output 1626 and can generate the in-storage processing output 224 from DEV_3. In this example, the in-storage processing output 224 from DEV_3 can be referred to as a third output 1628.

Continuing with this example, the data preprocessor 1608 in DEV_3 can format or translate the information from the application request 220 or the third output 1628 that will be forwarded to DEV_1. The data preprocessor 1608 in DEV_3 can also format or translate the in-storage processing output 224 from DEV_3 or the third output 1628.

Furthering this example, the output coordinator 1612 in DEV_3 can send the output of the data preprocessor 1608 in DEV_3, the third output 1628, a portion of the application request 220, or a combination thereof to DEV_1. DEV_1 can return to the host computer 102 or the application 202 the aggregated output 1604 based on the first output 1624, the second output 1626, and the third output 1628.

In this example, in-storage processing by one of the storage devices 218 that follows a previous storage device 218 can aggregate the in-storage processing outputs 224 of the storage devices 218 that preceded it. In other words, the second output 1626 is an aggregation of the in-storage processing output 224 from DEV_2 as well as the first output 1624. The third output 1628 is an aggregation of the in-storage processing output 224 from DEV_3 as well as the second output 1626.
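The chained aggregation described for FIG. 16 can be sketched as follows, under stated assumptions. The function `chained_processing`, its parameters, and the use of a running sum are hypothetical and serve only to illustrate how each device folds its own output into the partial aggregated output received from the preceding device.

```python
def chained_processing(device_data, local_isp, combine, initial):
    """Sketch of chained in-storage processing across DEV_1 .. DEV_N.

    device_data: per-device local data, in chain order.
    local_isp:   hypothetical in-storage function applied to local data.
    combine:     folds a device's local output into the partial aggregate.
    """
    partial = initial
    forwarded = []  # what each device passes on to the next one
    for local_data in device_data:
        local_output = local_isp(local_data)
        # Each device aggregates its own output with what preceded it.
        partial = combine(partial, local_output)
        forwarded.append(partial)
    # The last partial aggregate is the aggregated output.
    return partial, forwarded


aggregated, forwarded = chained_processing(
    device_data=[[1, 2], [3], [4, 5]],
    local_isp=sum,
    combine=lambda previous, local: previous + local,
    initial=0,
)
```

Here `forwarded` plays the role of the first, second, and third outputs in the example above: each entry already contains the contributions of all earlier devices.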

Referring now to FIG. 17, therein is shown an example of a flow chart for the request distributor 210 and the data preprocessor 208. The request distributor 210 and the data preprocessor 208 can be operated in a centralized or decentralized model as described earlier, as examples.

As an overview of this example, this flow chart depicts how the application data 214 of FIG. 2 can be translated to the formatted data 216 of FIG. 2 based on the storage policies. As examples, the storage policies can include the split policy, the split+padding policy, the split+redundancy policy, and storage without any chunking of the application units 304 of FIG. 3 to the formatted units 306 of FIG. 3. This example can represent the application request 220 of FIG. 2 as a write request.

The request distributor 210 of FIG. 2 can receive the application request 220 directly or some form of the application request 220 through the host computer 102 of FIG. 1. The application request 220 can include information such as the data address 1204 of FIG. 12, the application data 214, the transfer length 302 of FIG. 3, the logical boundary 1206 of FIG. 12, or a combination thereof.

As an example, the request distributor 210 can execute a chunk comparison 1702. The chunk comparison 1702 compares the transfer length 302 with a chunk size 1704 of the storage group 206, in this example operating as a RAID system. The chunk size 1704 represents a discrete unit of storage size to be stored in the storage devices 218 of FIG. 2 in the storage group 206 of FIG. 2. As an example, the chunk size 1704 can represent the size of one of the formatted units 306.

If the chunk comparison 1702 determines the transfer length 302 is greater than the chunk size 1704, the handling of the application request 220 can continue to a boundary query 1706. If the chunk comparison 1702 determines that the transfer length 302 is not greater than the chunk size 1704, the handling of the application request 220 can continue to a device selection 1708.

The branch of the flow chart starting with the device selection 1708 represents the handling of the application data 214 without chunking of the application units 304 or the application data 214. An example of this can be the mirroring function as described in FIG. 6.

Continuing with this branch of the flow chart, the device selection 1708 determines which of the storage devices 218 in the storage group 206 will store the application data 214 as part of the application request 220. The request distributor 210 can generate the device requests 1202 of FIG. 12 as appropriate based on the application request 220.

When the logical boundary 1206 of FIG. 12 for the application units 304 is included with the application request 220, the request distributor 210 can distribute the application request 220 by splitting the application request 220 to sub-application requests 222 of FIG. 2 or by sending identical application requests 220 to multiple storage devices 218.

In the example for the sub-application requests 222, the size of each of the sub-application requests 222 can be a multiple of the logical boundary 1206 of the application units 304. The sub-application requests 222 can be the device requests 1202 issued to the storage devices 218.
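As a non-limiting sketch, one way to size sub-requests so that each is a multiple of the logical boundary is shown here. The function name split_request, the even distribution across devices, and the requirement that the transfer length itself be a multiple of the logical boundary are assumptions made for illustration and are not stated in the description.

```python
def split_request(transfer_length, logical_boundary, device_count):
    """Return sub-request sizes, each a multiple of logical_boundary.

    Assumes (hypothetically) that transfer_length is already a whole number
    of application units and that units are spread as evenly as possible.
    """
    if logical_boundary <= 0 or device_count <= 0:
        raise ValueError("logical_boundary and device_count must be positive")
    if transfer_length % logical_boundary != 0:
        raise ValueError("transfer_length is not a multiple of logical_boundary")
    units = transfer_length // logical_boundary
    base, extra = divmod(units, device_count)
    sizes = [(base + (1 if i < extra else 0)) * logical_boundary
             for i in range(device_count)]
    # Devices that would receive no units get no sub-request at all.
    return [size for size in sizes if size > 0]


sizes = split_request(transfer_length=1000, logical_boundary=100, device_count=3)
```

With these example values, no application unit straddles two sub-requests, which is the property the paragraph above relies on.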

In the example for identical application requests 220, multiple storage devices 218 can receive these application requests 220. The first in-storage processing output 224 of FIG. 2 returned can be accepted by the output coordinator 212 of FIG. 2 to be returned back to the application 202. The identical application requests 220 can be the device requests 1202 issued to the storage devices 218.

When the logical boundary 1206 for the application units 304 is not included, the request distributor 210 can split the application request 220 to the sub-application requests 222. Each of these sub-application requests 222 can be of an arbitrary length. The requests can be handled as a split function by the data preprocessor 208. The sub-application requests 222 can be the device requests 1202 issued to the storage devices 218.

The request distributor 210, the data preprocessor 208, or a combination thereof can continue from the device selection 1708 to an address calculation 1710. The address calculation 1710 can calculate the address for the application data 214 or the formatted data 216 to be stored in the storage devices 218 receiving the device requests 1202. For illustrative purposes, the address calculation 1710 is described as being performed by the request distributor 210 or the data preprocessor 208, although it is understood that the address calculation 1710 can be performed elsewhere. For example, the storage devices 218 receiving the device requests 1202 can perform the address calculation 1710. Also for example, the address can be a pass-through from the application request 220, in which case the address calculation 1710 could have been performed by the application 202 of FIG. 2 or by the host computer 102.

The flow chart can continue to a write non-chunk function 1712. Each of the storage devices 218 receiving the device request 1202 can write the application data 214 or the formatted data 216 on the storage device 218. Since each of the storage devices 218 contains the application data 214 in a complete or non-chunked form, any of the application data 214 or the formatted data 216 can undergo in-storage processing by the storage device 218 with the application data 214.

Returning to the branch of the flow chart from the boundary query 1706, the boundary query 1706 determines if the logical boundary 1206 is provided in the application request 220, as an example. If the boundary query 1706 determines that the logical boundary 1206 is provided, the flow chart can continue to a padding query 1714. If the boundary query 1706 determines that the logical boundary 1206 is not provided, the flow chart can continue to a normal RAID query 1716.

The branch of the flow chart starting with the normal RAID query 1716 represents the handling of the application data 214 with chunking of the application units 304 (or some of the application units 304). An example of this can be the split function described in FIG. 3. As an example, this branch of the flow chart can be used for unstructured application data 214 or for application data 214 with no logical boundary 1206. The chunk size 1704 can be a fixed size or a variable-length size.

Continuing with this branch of the flow chart, the normal RAID query 1716 determines whether or not the application request 220 is for a normal RAID function as the in-storage processing. If so, the flow chart can continue to a chunk function 1718. If not, the flow chart can continue to another portion of the flow chart or can return an error status back to the application 202.

In this example, the chunk function 1718 can split the application data 214 or the application units 304 or some portion of them into the chunk size 1704 for the storage devices 218 to receive the application data 214. As an example, the data preprocessor 208 can perform the chunk function 1718 to generate the formatted data 216 or the formatted units 306 with the application data 214 translated to the chunk size 1704. The data preprocessor 208 can interact with the request distributor 210 to issue the device requests 1202 to the storage devices 218.

For illustrative purposes, the chunk function 1718 is described as being performed by the data preprocessor 208, although it is understood that the chunk function 1718 can be executed differently. For example, the storage devices 218 receiving the device requests 1202 can perform the chunk function 1718 as part of the in-storage processing at the storage devices 218.

In this example, the flow chart can continue to a write chunk function 1719. The write chunk function 1719 is an example of the in-storage processing at the storage devices 218. The write chunk function 1719 writes the formatted data 216 or the formatted units 306 at the storage devices 218 receiving the device requests 1202 from the request distributor 210.

Returning to the branch of the flow chart from the padding query 1714, the branch below the padding query 1714 represents the handling of the application data 214 or the application units 304 or a portion thereof with the data pads 402. An example of this can be the split+padding function as described in FIG. 4.

The padding query 1714 determines if the application data 214 or the application units 304 or some portion of them should be padded to generate the formatted data 216 or the formatted units 306. The data preprocessor 208 can perform the padding query 1714.

When the padding query 1714 determines that padding of the application units 304 is needed, the flow chart can continue to an application data sizing 1720. The application data sizing 1720 calculates a data size 1722 of the application data 214 for the split+padding function. The data size 1722 is the amount of the application data 214 to be partitioned for the formatted data 216. As an example, the application data sizing 1720 can determine the data size 1722 for the amount of the application unit 304 or multiple application units 304 for each of the formatted units 306. In this example, each of the formatted units 306 is of the chunk size 1704 and the data size 1722 is per chunk.

As a specific example, the data size 1722 can be calculated with Equation 1 below.

data size 1722 = floor(chunk size 1704 / logical boundary 1206) × logical boundary 1206  (Equation 1)

In other words, the data size 1722 is calculated with the floor function of the chunk size 1704 divided by the logical boundary 1206. The result of the floor function is then multiplied by the logical boundary 1206 to generate the data size 1722.

The flow chart can continue to a pad sizing 1724. The pad sizing 1724 calculates a pad size 1726 for the data pads 402 for each of the formatted units 306. As an example, the pad size 1726 can be calculated with Equation 2 below.

pad size 1726 = chunk size 1704 − data size 1722  (Equation 2)

In other words, the pad size 1726 per chunk or per each of the formatted units 306 can be calculated by subtracting the data size 1722 per chunk or per each of the formatted units 306 from the chunk size 1704.
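As a worked illustration of Equation 1 and Equation 2, the short sketch below computes both sizes with integer arithmetic. The function names and the sample values (a chunk size of 4096 and a logical boundary of 100) are hypothetical and chosen only to make the arithmetic visible.

```python
def data_size(chunk_size, logical_boundary):
    """Equation 1: largest multiple of logical_boundary that fits in a chunk."""
    if chunk_size <= 0 or logical_boundary <= 0:
        raise ValueError("sizes must be positive")
    # Integer floor division implements the floor function for positive sizes.
    return (chunk_size // logical_boundary) * logical_boundary


def pad_size(chunk_size, logical_boundary):
    """Equation 2: remainder of the chunk left over after the data."""
    return chunk_size - data_size(chunk_size, logical_boundary)


# Hypothetical values: 40 whole application units of 100 bytes fit in a chunk.
example_data_size = data_size(4096, 100)
example_pad_size = pad_size(4096, 100)
```

With these values the data size is 4000 and the pad size is 96. When the chunk size is an exact multiple of the logical boundary, the pad size is zero.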

The flow chart can continue to a chunk number calculation 1728. The chunk number calculation 1728 determines a chunk number 1730 or the number of the formatted units 306 needed for the application data 214. The chunk number 1730 can be used to determine the size or length of the formatted data 216. The data preprocessor 208 can perform the chunk number calculation 1728.

The flow chart can continue to a split function 1732. The split function 1732 partitions the application data 214 to the data size 1722 for each of the formatted units 306. The split function 1732 is part of generating the formatted data 216 where the application units 304 are aligned with the chunk size 1704 or the formatted units 306. The data preprocessor 208 can perform the split function 1732.

The flow chart can continue to a write pad function 1734. The write pad function 1734 performs the in-storage processing of writing the formatted data 216 with the application data 214 partitioned to the data size 1722 and with the data pads 402. The data pads 402 can include additional information, such as parity, metadata, synchronization fields, or identification fields. The request distributor 210 can send the device requests 1202 to the storage devices 218 to perform the write pad function 1734 of the formatted data 216.
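The sizing, chunk number, split, and padding steps above can be sketched together as follows. This is a minimal illustration under stated assumptions: the chunk number is taken to be the ceiling of the application data length divided by the data size, which the description does not spell out, and the pads are filled with zero bytes even though the description notes that pads can carry parity, metadata, or other fields. The function name split_with_padding is hypothetical.

```python
def split_with_padding(app_data, chunk_size, logical_boundary, pad_byte=b"\x00"):
    """Partition app_data into chunk_size formatted units with trailing pads."""
    data_size = (chunk_size // logical_boundary) * logical_boundary  # Equation 1
    if data_size == 0:
        raise ValueError("logical_boundary is larger than chunk_size")
    # Assumed chunk number: enough formatted units to hold all the data.
    chunk_number = -(-len(app_data) // data_size)  # ceiling division
    units = []
    for i in range(chunk_number):
        piece = app_data[i * data_size:(i + 1) * data_size]
        # Fill the rest of the chunk; for a full piece this is the pad size
        # of Equation 2, and a short final piece gets extra fill.
        units.append(piece + pad_byte * (chunk_size - len(piece)))
    return units


# Hypothetical example: 3-byte application units stored in 8-byte chunks.
formatted_units = split_with_padding(b"abcdefghijkl", chunk_size=8, logical_boundary=3)
```

In this example each formatted unit holds two whole application units followed by a two-byte pad, so no application unit crosses a chunk boundary.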

Returning to the padding query 1714, when the padding query 1714 determines that padding of the application units 304 is not needed, the flow chart can continue to a redundancy query 1736. When the redundancy query 1736 determines that redundancy of the application data 214 is needed, then this branch of the flow chart represents the redundancy function. As an example, the redundancy function is described in FIG. 6.

The flow chart can continue from the redundancy query 1736 to the application data sizing 1720. As an example, FIG. 17 depicts the application data sizing 1720 under the redundancy query 1736 to be a separate function from the application data sizing 1720 under the padding query 1714, although it is understood that the two functions can perform the same operations and can also be the same function. The application data sizing 1720 under the redundancy query 1736 can be computed using the expression found in Equation 1 described earlier.

The flow chart can continue to a chunk function 1718. The chunk function 1718 splits or partitions the application data 214 to the formatted data 216 as described in FIG. 6. The data preprocessor 208 can perform the chunk function 1718. As an example, FIG. 17 depicts the chunk function 1718 under the normal RAID query 1716 to be a separate function from the chunk function 1718 under the redundancy query 1736, although it is understood that the two functions can perform the same operations and can also be the same function.

The flow chart can continue to a redundancy function 1738. For each chunk or for each of the formatted units 306, the redundancy function 1738 copies the application data 214 that is in the range of the data size 1722 and the chunk size 1704 to additional chunks to generate the replica data 602 of FIG. 6.

The flow chart can continue to a write redundancy function 1740. The write redundancy function 1740 writes the formatted data 216 including the application data 214 and the replica data 602. The request distributor 210 can issue the device requests 1202 to the storage devices 218 to perform the write redundancy function 1740. Returning to the branch with the redundancy query 1736, when the redundancy query 1736 determines that redundancy is not needed, the flow chart can continue to the normal RAID query 1716.
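As a summary of the decision points described for FIG. 17, the following sketch maps the chunk comparison, the boundary query, the padding query, the redundancy query, and the normal RAID query onto a single routing function. The function name select_branch, its boolean parameters, and the returned labels are hypothetical conveniences for illustration; the error case corresponds to returning an error status, which is one of the two options the description allows.

```python
def select_branch(transfer_length, chunk_size, has_boundary,
                  needs_padding=False, needs_redundancy=False, normal_raid=True):
    """Return a label for the branch of the flow chart that handles a write."""
    # Chunk comparison: data no larger than a chunk is written without chunking.
    if transfer_length <= chunk_size:
        return "non-chunk"
    # Boundary query: padding and redundancy apply only with a logical boundary.
    if has_boundary:
        if needs_padding:          # padding query
            return "split+padding"
        if needs_redundancy:       # redundancy query
            return "redundancy"
    # Normal RAID query: reached without a boundary, or when neither padding
    # nor redundancy is needed.
    return "chunk" if normal_raid else "error"


branch = select_branch(transfer_length=8192, chunk_size=4096,
                       has_boundary=True, needs_padding=True)
```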

For illustrative purposes, the flow chart is described with the split+padding function separately from the redundancy function, although it is understood that the flow chart can provide a different operation. For example, the flow chart can be arranged to provide the split+redundancy function as described in FIG. 5. As an example, this can be accomplished with the redundancy query 1736 being placed before the write pad function 1734. Furthering this example, the redundancy function 1738 above could be modified to operate only on the non-aligned application units 304 to form the aligned data 504 of FIG. 5 as opposed to the replica data 602. The modified redundancy function can be followed by a further write function. The further write function would combine portions of the write pad function 1734 and the write redundancy function 1740. The write pad function 1734 can utilize a portion of the formatted data 216 with the data pads 402 and the write redundancy function 1740 can write the aligned data 504 as opposed to the replica data 602.

Referring now to FIG. 18, therein is shown an example of a flow chart for a mirroring function for centralized and decentralized embodiments. As examples, the centralized embodiment can be the computing system 900 of FIG. 9 or the computing system 1000 of FIG. 10. As an example, the decentralized embodiment can be the computing system 1100 of FIG. 11.

The flow chart on the left-hand side of FIG. 18 represents an example of a flow chart for a centralized embodiment. The flow chart on the right-hand side of FIG. 18 represents an example of a flow chart for a decentralized embodiment.

Starting with the centralized embodiment, the request distributor 210 of FIG. 2 can receive the application request 220 of FIG. 2. The application request 220 can include the data address 1204 of FIG. 12, the application data 214 of FIG. 2, and the transfer length 302 of FIG. 3.

For example, the data preprocessor 208 of FIG. 2 can execute a replica query 1802. The replica query 1802 determines if the replica data 602 of FIG. 6 should be created or not. As an example, the replica query 1802 can make this determination by checking whether a number 1804 of replica data 602 being requested is greater than zero. If so, the flow chart can continue to a create replica 1806. If not, the flow chart can continue to the device selection 1708.

As an example, the device selection 1708 can be the same function or perform the same or similar function as described in FIG. 17. The flow chart can continue to the address calculation 1710. As with the device selection 1708, the address calculation 1710 can be the same function or perform the same or similar function as described in FIG. 17. The flow chart can continue to the write non-chunk function 1712. As with the address calculation 1710, the write non-chunk function 1712 can be the same function or perform the same or similar function as described in FIG. 17.

As an example, the request distributor 210 can execute the device selection 1708, the address calculation 1710, or a combination thereof and include the outputs of these operations as part of the device request 1202 of FIG. 12. The write non-chunk function 1712 can be performed by one of the storage devices 218 to store the application data 214.

Returning to the replica query 1802, when the replica query 1802 determines the replica data 602 of FIG. 6 should be generated, then the flow chart can continue to the create replica 1806. As an example, the replica query 1802 can make this determination when the number 1804 of the replica data 602 sought is greater than zero.

In this example, the create replica 1806 can generate the replica data 602 from the application data 214. The replica data 602 can be as described in FIG. 6. As an example, the data preprocessor 208 can perform the create replica 1806. The create replica 1806 can generate the number 1804 of the replica data 602 as needed and not just one.

The flow chart can continue to a prepare replica 1808. As an example, the request distributor 210 can prepare each of the replica data 602 for the device selection 1708. The replica data 602 can be written to the storage devices 218 following the flow chart from the device selection 1708, as already described.

Returning to the flow chart for the decentralized embodiment on the right-hand side of FIG. 18, the request distributor 210 can receive the application request 220. The application request 220 can include the data address 1204, the application data 214, the transfer length 302, and the number 1804 of the replica data 602.

The request distributor 210 can send one of the device requests 1202 to one of the storage devices 218. That storage device 218 can perform the address calculation 1710. As an example, the address calculation 1710 can be the same function or perform the same or similar function as described in FIG. 17 and as for the centralized embodiment.

In this example, the same storage device 218 can also perform the write non-chunk function 1712. As an example, the write non-chunk function 1712 can be the same function or perform the same or similar function as described in FIG. 17 and as for the centralized embodiment.

The flow chart can continue to the replica query 1802. As an example, the replica query 1802 can be the same function or perform the same or similar function as described for the centralized embodiment. If the number 1804 for the replica data 602 is not greater than zero, the process to write additional data stops for this particular application request 220.

If the replica query 1802 determines that the number 1804 for the replica data 602 is greater than zero, then the flow chart can continue to a group selection 1810. The group selection 1810 can select one of the storage devices 218 in the same replica group 1812. The replica group 1812 is a portion of the storage devices 218 of FIG. 2 in the storage group 206 of FIG. 2 designated to be part of a redundancy function for the application data 214 and for in-storage processing. The request distributor 210 can perform the replica query 1802, the group selection 1810, or a combination thereof.

The flow chart can continue to a number update 1814. The number update 1814 can decrement the number 1804 for replica data 602 still to be written to the replica group 1812. The decrement amount can be by an integer value, such as one. The request distributor 210 can perform the number update 1814.

The flow chart can continue to a request generation 1816. The request generation 1816 generates one of the device requests 1202 to another of the storage devices 218 in the replica group 1812 for writing the replica data 602. The request distributor 210 can perform the request generation 1816.

The flow chart can loop back (not drawn in FIG. 18) to the replica query 1802 and iterate until the number 1804 has reached zero. At this point, the replica data 602 has been written to the replica group 1812.
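The serial loop of the decentralized embodiment can be sketched as follows. This is only an illustration: the function name write_with_replicas, the use of device name strings, the choice of the next unused device as the group selection, and the returned list of (device, data) pairs standing in for device requests are all assumptions and are not drawn from FIG. 18.

```python
def write_with_replicas(primary, replica_group, data, number):
    """Return the device requests issued for one write and its replicas.

    primary:       device that performs the initial non-chunk write.
    replica_group: devices designated for redundancy (may include primary).
    number:        count of replicas still to be written.
    """
    requests = [(primary, data)]                 # initial non-chunk write
    candidates = [dev for dev in replica_group if dev != primary]
    while number > 0:                            # replica query
        if not candidates:
            raise ValueError("replica group has too few devices")
        target = candidates.pop(0)               # group selection (assumed order)
        number -= 1                              # number update, decrement by one
        requests.append((target, data))          # request generation
    return requests


issued = write_with_replicas("DEV_1", ["DEV_1", "DEV_2", "DEV_3"], b"payload", 2)
```

When the requested number is zero, the loop body never runs and only the initial write is issued, which matches the stop condition described for the replica query.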

For illustrative purposes, the decentralized embodiment is described as operating in a serial manner writing to one of the storage devices 218 at a time, although it is understood that the decentralized embodiment can operate differently. For example, the request distributor 210 can issue a number of device requests 1202 to the storage devices 218 in the replica group 1812 and have the replica data 602 written on multiple storage devices 218 simultaneously before the other storage devices 218 in the replica group 1812 complete the write.

It has been discovered that the computing system provides efficient distributed processing by providing methods and apparatuses for performing in-storage processing with multiple storage devices, with capabilities for performing in-storage processing of application data. An execution of an application can be shared by distributing the execution among various devices in a storage device. Each of the devices can perform in-storage processing with the application data as requested by an application request.

It has also been discovered that the computing system can reduce overall system power consumption by reducing the number of inputs/outputs between the application execution and the storage device. This reduction is achieved by having the devices perform the in-storage processing instead of mere storage, read, and re-store by the application. Instead, the in-storage processing outputs can be returned as an aggregated output from the various devices that performed the in-storage processing back to the application. The application can continue to execute and utilize the in-storage outputs, the aggregated output, or a combination thereof.

It has been discovered that the computing system provides for reduced total cost of ownership by providing formatting and translation function of the application data for different configuration or organization of the storage device. Further, the computing system also provides translation for the type of in-storage processing to be carried out by the devices in the storage device. Examples of types of translation or formatting include split, split+padding, split+redundancy, and mirroring.

It has been discovered that the computing system provides more efficient execution of the application with fewer interrupts to the application via the output coordination of the in-storage processing outputs from the storage devices. The output coordination can buffer the in-storage processing outputs and can also sort the order of each of the in-storage processing outputs before returning an aggregated output to the application. The application can continue to execute and utilize the in-storage outputs, the aggregated output, or a combination thereof.
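As a rough sketch of the buffering and sorting just described, the function below collects outputs that arrive in any order and releases a single ordered result only once all of them are present. The function name coordinate_outputs, the use of a sequence number as the sort key, and the choice to return None while outputs are still outstanding are assumptions for illustration; the description does not specify how the order is determined.

```python
def coordinate_outputs(arrivals, expected):
    """Buffer (sequence, output) pairs and return them in sequence order.

    arrivals: iterable of (sequence_number, output) pairs in arrival order.
    expected: number of in-storage processing outputs awaited.
    Returns the ordered outputs, or None if some are still outstanding.
    """
    buffer = {}
    for sequence, output in arrivals:
        buffer[sequence] = output          # buffer each output as it arrives
    if len(buffer) < expected:
        return None                        # do not interrupt the application yet
    # Sort by the assumed sequence number before returning one aggregate.
    return [buffer[sequence] for sequence in sorted(buffer)]


aggregated = coordinate_outputs([(2, "c"), (0, "a"), (1, "b")], expected=3)
```

Returning one ordered aggregate, rather than each output as it arrives, is what keeps the number of interrupts to the application low in this sketch.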

It has been discovered that the computing system further minimizes integration obstacles by allowing the devices in the storage group to have different or the same functionalities. As an example, one of the devices can function as the only output coordinator for all the in-storage processing outputs from the other devices. As a further example, the aggregation function can be distributed amongst the devices passing along and performing partial aggregation from device to device until one of the devices returns the full aggregated output back to the application. The application can continue to execute and utilize the in-storage outputs, the aggregated output, or a combination thereof.

The modules described in this application can be hardware implementations or hardware accelerators in the computing system 100. The modules can also be hardware implementations or hardware accelerators within the computing system 100 or external to the computing system 100.

The modules described in this application can be implemented as instructions stored on a non-transitory computer readable medium to be executed by the computing system 100. The non-transitory computer readable medium can include memory internal to or external to the computing system 100. The non-transitory computer readable medium can include non-volatile memory, such as a hard disk drive, non-volatile random access memory (NVRAM), solid-state drive (SSD), compact disk (CD), digital video disk (DVD), or universal serial bus (USB) flash memory devices. The non-transitory computer readable medium can be integrated as a part of the computing system 100 or installed as a removable portion of the computing system 100.

Referring now to FIG. 19, therein is shown a flow chart of a method 1900 of operation of a computing system 100 in an embodiment of the present invention. The method 1900 includes: performing in-storage processing with a storage device with formatted data based on application data from an application in a block 1902; and returning an in-storage processing output from the storage device to the application for continued execution in a block 1904.

The method 1900 can further include receiving a sub-application request at the storage device based on an application request from the application for performing in-storage processing. The method 1900 can further include sorting in-storage processing outputs from a storage group including the storage device. The method 1900 can further include issuing a device request based on an application request from the application to a storage group including the storage device.

The method 1900 can further include issuing a device request from the storage device; receiving the device request at another storage device; generating another device request by the another storage device; and receiving the another device request by yet another storage device.

The method 1900 can further include sending in-storage processing outputs by a storage group including the storage device to be aggregated and sent to the application. The method 1900 can further include aggregating an in-storage processing output as a partial aggregated output to be returned to the application. The method 1900 can further include generating the formatted data based on the application data. The method 1900 can further include generating a formatted unit of the formatted data with an application unit of the application data and a data pad. The method 1900 can further include generating a formatted unit of the formatted data with non-aligned instances of application units of the application data and a data pad.

While the invention has been described in conjunction with a specific best mode, it is to be understood that many alternatives, modifications, and variations will be apparent to those skilled in the art in light of the foregoing description. Accordingly, it is intended to embrace all such alternatives, modifications, and variations that fall within the scope of the included claims. All matters set forth herein or shown in the accompanying drawings are to be interpreted in an illustrative and non-limiting sense.

What is claimed is:
 1. A computing system comprising: a storage device configured to: perform in-storage processing with formatted data based on application data from an application; and return an in-storage processing output to the application for continued execution.
 2. The system as claimed in claim 1 wherein the storage device is further configured to receive a sub-application request based on an application request from the application for performing in-storage processing.
 3. The system as claimed in claim 1 wherein the storage device is further configured to generate an aggregated output from in-storage processing outputs from one or more other storage devices and return the aggregated output to the application for continued execution.
 4. The system as claimed in claim 1 wherein the storage device is further configured to issue a device request based on an application request from the application to at least one of other storage devices.
 5. The system as claimed in claim 1 wherein: the storage device is further configured to: issue a device request; further comprising: another storage device configured to: receive the device request, generate another device request; and yet another storage device configured to receive the another device request.
 6. The system as claimed in claim 1 further comprising a storage group including the storage device, configured to send in-storage processing outputs to be aggregated and sent to the application.
 7. The system as claimed in claim 1 wherein the storage device is further configured to aggregate an in-storage processing output as a partial aggregated output to be returned to the application.
 8. The system as claimed in claim 1 wherein the storage device is further configured to generate the formatted data from the application data.
 9. The system as claimed in claim 1 wherein the storage device is further configured to generate a formatted unit of the formatted data with an application unit of the application data and a data pad.
 10. The system as claimed in claim 1 wherein the storage device is further configured to generate a formatted unit of the formatted data with non-aligned instances of application units of the application data and a data pad.
 11. A method of operation of a computing system comprising: performing in-storage processing with a storage device with formatted data based on application data from an application; and returning an in-storage processing output from the storage device to the application for continued execution.
 12. The method as claimed in claim 11 further comprising receiving a sub-application request at the storage device based on an application request from the application for performing in-storage processing.
 13. The method as claimed in claim 11 further comprising sorting in-storage processing outputs from a storage group including the storage device.
 14. The method as claimed in claim 11 further comprising issuing a device request based on an application request from the application to a storage group including the storage device.
 15. The method as claimed in claim 11 further comprising: issuing a device request from the storage device; receiving the device request at another storage device; generating another device request by the another storage device; and receiving the another device request by yet another storage device.
 16. The method as claimed in claim 11 further comprising sending in-storage processing outputs by a storage group including the storage device to be aggregated and sent to the application.
 17. The method as claimed in claim 11 further comprising aggregating an in-storage processing output as a partial aggregated output to be returned to the application.
 18. The method as claimed in claim 11 further comprising generating the formatted data based on the application data.
 19. The method as claimed in claim 11 further comprising generating a formatted unit of the formatted data with an application unit of the application data and a data pad.
 20. The method as claimed in claim 11 further comprising generating a formatted unit of the formatted data with non-aligned instances of application units of the application data and a data pad.